For as the heavens are higher than the earth, so are my ways higher than 
your ways, and my thoughts than your thoughts, says the Lord. 
Isaiah 55:9 


Preface 


Ideas from quantum physics play important roles in many parts of modern 
mathematics. Many parts of representation theory, for example, are motivated
by quantum mechanics, including the Wigner-Mackey theory of induced
representations, the Kirillov-Kostant orbit method, and, of course,
quantum groups. The Jones polynomial in knot theory, the Gromov-Witten
invariants in topology, and mirror symmetry in algebraic topology are other 
notable examples. The awarding of the 1990 Fields Medal to Ed Witten, a 
physicist, gives an idea of the scope of the influence of quantum theory in 
mathematics. 

Despite the importance of quantum mechanics to mathematics, there is 
no easy way for mathematicians to learn the subject. Quantum mechan- 
ics books in the physics literature are generally not easily understood by 
most mathematicians. There is, of course, a lower level of mathematical 
precision in such books than mathematicians are accustomed to. In addi- 
tion, physics books on quantum mechanics assume knowledge of classical 
mechanics that mathematicians often do not have. And, finally, there is a 
subtle difference in “culture”—differences in terminology and notation— 
that can make reading the physics literature like reading a foreign language 
for the mathematician. There are few books that attempt to translate quan- 
tum theory into terms that mathematicians can understand. 

This book is intended as an introduction to quantum mechanics for math- 
ematicians with little prior exposure to physics. The twin goals of the book 
are (1) to explain the physical ideas of quantum mechanics in language 
mathematicians will be comfortable with, and (2) to develop the neces- 
sary mathematical tools to treat those ideas in a rigorous fashion. I have 
attempted to give a reasonably comprehensive treatment of nonrelativistic 
quantum mechanics, including topics found in typical physics texts (e.g., 
the harmonic oscillator, the hydrogen atom, and the WKB approximation) 
as well as more mathematical topics (e.g., quantization schemes, the Stone-von
Neumann theorem, and geometric quantization). I have also attempted 
to minimize the mathematical prerequisites. I do not assume, for example, 
any prior knowledge of spectral theory or unbounded operators, but provide
a full treatment of those topics in Chaps. 6 through 10 of the text. 
Similarly, I do not assume familiarity with the theory of Lie groups and 
Lie algebras, but provide a detailed account of those topics in Chap. 16. 
Whenever possible, I provide full proofs of the stated results. 

Most of the text will be accessible to graduate students in mathematics 
who have had a first course in real analysis, covering the basics of $L^2$ spaces 
and Hilbert spaces. Appendix A reviews some of the results that are used in 
the main body of the text. In Chaps. 21 and 23, however, I assume knowl- 
edge of the theory of manifolds. I have attempted to provide motivation for 
many of the definitions and proofs in the text, with the result that there 
is a fair amount of discussion interspersed with the standard definition- 
theorem-proof style of mathematical exposition. There are exercises at the 
end of each chapter, making the book suitable for graduate courses as well 
as for independent study. 

In comparison to the present work, classics such as Reed and Simon [34] 
and Glimm and Jaffe [14], along with the recent book of Schmüdgen [35], 
are more focused on the mathematical underpinnings of the theory than 
on the physical ideas. Hannabuss’s text [22] is fairly accessible to math- 
ematicians, but—despite the word “graduate” in the title of the series— 
uses an undergraduate level of mathematics. The recent book of Takhtajan 
[39], meanwhile, has an expository bent to it, but provides less physical 
motivation and is less self-contained than the present book. Whereas, for 
example, Takhtajan begins with Lagrangian and Hamiltonian mechanics 
on manifolds, I begin with “low-tech” classical mechanics on the real line. 
Similarly, Takhtajan assumes knowledge of unbounded operators and Lie 
groups, while I provide substantial expositions of both of those subjects. 
Finally, there is the work of Folland [13], which I highly recommend, but 
which deals with quantum field theory, whereas the present book treats 
only nonrelativistic quantum mechanics, except for a very brief discussion 
of quantum field theory in Sect. 20.6. 

The book begins with a quick introduction to the main ideas of classical 
and quantum mechanics. After a brief account in Chap. 1 of the historical 
origins of quantum theory, I turn in Chap. 2 to a discussion of the neces- 
sary background from classical mechanics. This includes Newton’s equa- 
tion in varying degrees of generality, along with a discussion of important 
physical quantities such as energy, momentum, and angular momentum, 
and conditions under which these quantities are “conserved” (i.e., constant 
along each solution of Newton’s equation). I give a short treatment here 
of Poisson brackets and Hamilton’s form of Newton’s equation, deferring a 
full discussion of “fancy” classical mechanics to Chap. 21. 

In Chap. 3, I attempt to motivate the structures of quantum mechanics in 
the simplest setting. Although I discuss the “axioms” (in standard physics 
terminology) of quantum mechanics, I resolutely avoid a strictly axiomatic 
approach to the subject (using, say, C*-algebras). Rather, I try to provide 
some motivation for the position and momentum operators and the Hilbert 
space approach to quantum theory, as they connect to the probabilistic as- 
pect of the theory. I do not attempt to explain the strange probabilistic 
nature of quantum theory, if, indeed, there is any explanation of it. Rather, 
I try to elucidate how the wave function, along with the position and mo- 
mentum operators, encodes the relevant probabilities. 

In Chaps. 4 and 5, we look into two illustrative cases of the Schrödinger 
equation in one space dimension: a free particle and a particle in a square 
well. In these chapters, we encounter such important concepts as the dis- 
tinction between phase velocity and group velocity and the distinction be- 
tween a discrete and a continuous spectrum. 

In Chaps. 6 through 10, we look into some of the technical mathematical 
issues that are swept under the carpet in earlier chapters. I have tried to 
design this section of the book in such a way that a reader can take in as 
much or as little of the mathematical details as desired. For a reader who 
simply wants the big picture, I outline the main ideas and results of spec- 
tral theory in Chap. 6, including a discussion of the prototypical example 
of an operator with a continuous spectrum: the momentum operator. For 
a reader who wants more information, I provide statements of the spec- 
tral theorem (in two different forms) for bounded self-adjoint operators in 
Chap. 7, and an introduction to the notion of unbounded self-adjoint op- 
erators in Chap. 9. Finally, for the reader who wants all the details, I give 
proofs of the spectral theorem for bounded and unbounded self-adjoint 
operators, in Chaps. 8 and 10, respectively. 

In Chaps. 11 through 14, we turn to the vitally important canonical com- 
mutation relations. These are used in Chap. 11 to derive algebraically the 
spectrum of the quantum harmonic oscillator. In Chap. 12, we discuss the 
uncertainty principle, both in its general form (for arbitrary pairs of non- 
commuting operators) and in its specific form (for the position and momen- 
tum operators). We pay careful attention to subtle domain issues that are 
usually glossed over in the physics literature. In Chap. 13, we look at differ- 
ent “quantization schemes” (i.e., different ways of ordering products of the 
noncommuting position and momentum operators). In Chap. 14, we turn to 
the celebrated Stone-von Neumann theorem, which provides a uniqueness 
result for representations of the canonical commutation relations. As in the 
case of the uncertainty principle, there are some subtle domain issues here 
that require attention. 

In Chaps. 15 through 18, we examine some less elementary issues in quan- 
tum theory. Chapter 15 addresses the WKB (Wentzel-Kramers-Brillouin) 
approximation, which gives simple but approximate formulas for the eigenvectors
and eigenvalues for the Hamiltonian operator in one dimension. 
After this, we introduce (Chap. 16) the notion of Lie groups, Lie algebras,
and their representations, all of which play an important role in 
many parts of quantum mechanics. In Chap. 17, we consider the example 
of angular momentum and spin, which can be understood in terms of the 
representations of the rotation group SO(3). Here a more mathematical 
approach—especially the relationship between Lie group representations 
and Lie algebra representations—can substantially clarify a topic that is 
rather mysterious in the physics literature. In particular, the concept of 
“fractional spin” can be understood as describing a representation of the 
Lie algebra of the rotation group for which there is no associated represen- 
tation of the rotation group itself. In Chap. 18, we illustrate these ideas by 
describing the energy levels of the hydrogen atom, including a discussion 
of the hidden symmetries of hydrogen, which account for the “accidental 
degeneracy” in the levels. In Chap. 19, we look more closely at the concept 
of the “state” of a system in quantum mechanics. We look at the notion 
of subsystems of a quantum system in terms of tensor products of Hilbert 
spaces, and we see in this setting that the notion of “pure state” (a unit 
vector in the relevant Hilbert space) is not adequate. We are led, then, to 
the notion of a mixed state (or density matrix). We also examine the idea 
that, in quantum mechanics, “identical particles are indistinguishable.” 

Finally, in Chaps. 20 through 23, we examine some advanced topics in 
classical and quantum mechanics. We begin, in Chap. 20, by considering the 
path integral formulation of quantum mechanics, both from the heuristic 
perspective of the Feynman path integral, and from the rigorous perspective 
of the Feynman-Kac formula. Then, in Chap. 21, we give a brief treatment 
of Hamiltonian mechanics on manifolds. Finally, we consider the machinery 
of geometric quantization, beginning with the Euclidean case in Chap. 22 
and continuing with the general case in Chap. 23. 

I am grateful to all who have offered suggestions or made corrections 
to the manuscript, including Renato Bettiol, Edward Burkard, Matt Cecil, 
Tiancong Chen, Bo Jacoby, Will Kirwin, Nicole Kroeger, Wicharn Lewkeer- 
atiyutkul, Jeff Mitchell, Eleanor Pettus, Ambar Sengupta, and Augusto 
Stoffel. I am particularly grateful to Michel Talagrand who read almost 
the entire manuscript and made numerous corrections and suggestions. Fi- 
nally, I offer a special word of thanks to my advisor and friend, Leonard 
Gross, who started me on the path toward understanding the mathemati- 
cal foundations of quantum mechanics. Readers are encouraged to send me 
comments or corrections at bhall@nd.edu. 


Notre Dame, IN, USA Brian C. Hall 


1 The Experimental Origins of Quantum Mechanics


Quantum mechanics, with its controversial probabilistic nature and curious 
blending of waves and particles, is a very strange theory. It was not 
invented because anyone thought this is the way the world should behave, 
but because various experiments showed that this is the way the world 
does behave, like it or not. Craig Hogan, director of the Fermilab Particle 
Astrophysics Center, put it this way: 


No theorist in his right mind would have invented quantum 
mechanics unless forced to by data.¹


Although the first hint of quantum mechanics came in 1900 with Planck’s 
solution to the problem of blackbody radiation, the full theory did not 
emerge until 1925-1926, with Heisenberg’s matrix model, Schrödinger’s 
wave model, and Born’s statistical interpretation of the wave model. 


1.1 Is Light a Wave or a Particle? 


1.1.1 Newton Versus Huygens 


Beginning in the late seventeenth century and continuing into the early 
eighteenth century, there was a vigorous debate in the scientific community 
over the nature of light. One camp, following the views of Isaac 
Newton, claimed that light consisted of a group of particles or “corpuscles.”
The other camp, led by the Dutch physicist Christiaan Huygens, 
claimed that light was a wave. Newton argued that only a corpuscular theory
could account for the observed tendency of light to travel in straight 
lines. Huygens and others, on the other hand, argued that a wave theory 
could explain numerous observed aspects of light, including the bending 
or “refraction” of light as it passes from one medium to another, as from 
air into water. Newton’s reputation was such that his “corpuscular” theory 
remained the dominant one until the early nineteenth century. 


¹ Quoted in “Is Space Digital?” by Michael Moyer, Scientific American, February 
2012, pp. 30-36. 


1.1.2 The Ascendance of the Wave Theory of Light 


In 1804, Thomas Young published two papers describing and explaining 
his double-slit experiment. In this experiment, sunlight passes through a 
small hole in a piece of cardboard and strikes another piece of cardboard 
containing two small holes. The light then strikes a third piece of cardboard, 
where the pattern of light may be observed. Young observed “fringes” or 
alternating regions of high and low intensity for the light. Young believed 
that light was a wave and he postulated that these fringes were the result 
of interference between the waves emanating from the two holes. Young 
drew an analogy between light and water, where in the case of water, 
interference is readily observed. If two circular waves of water cross each 
other, there will be some points where a peak of one wave matches up with 
a trough of another wave, resulting in destructive interference, that is, a 
partial cancellation between the two waves, resulting in a small amplitude 
of the combined wave at that point. At other points, on the other hand, a 
peak in one wave will line up with a peak in the other, or a trough with 
a trough. At such points, there is constructive interference, with the result 
that the amplitude of the combined wave is large at that point. The pattern 
of constructive and destructive interference will produce something like a 
checkerboard pattern of alternating regions of large and small amplitudes 
in the combined wave. The dimensions of each region will be roughly on 
the order of the wavelength of the individual waves. 

Based on this analogy with water waves, Young was able to explain the 
interference fringes that he observed and to predict the wavelength that 
light must have in order for the specific patterns he observed to occur. 
Based on his observations, Young claimed that the wavelength of visible 
light ranged from about 1/36,000 in. (about 700 nm) at the red end of the 
spectrum to about 1/60,000 in. (about 425 nm) at the violet end of the 
spectrum, results that agree with modern measurements. 
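
As a quick check on the unit conversion quoted above, the following Python sketch converts Young's figures from fractions of an inch to nanometers. The function name is an illustrative choice, not something from the text; the only input is the exact definition 1 in = 25.4 mm.

```python
# Convert Young's wavelength estimates from fractions of an inch to nanometers.
NM_PER_INCH = 25.4e6  # 1 in = 25.4 mm exactly, and 1 mm = 1e6 nm


def inch_fraction_to_nm(denominator: float) -> float:
    """Return the length of 1/denominator inch, expressed in nanometers."""
    return NM_PER_INCH / denominator


red_nm = inch_fraction_to_nm(36_000)     # red end of the visible spectrum
violet_nm = inch_fraction_to_nm(60_000)  # violet end of the visible spectrum
```

Both values land within a few nanometers of the rounded figures given in the text.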

Figure 1.1 shows how circular waves emitted from two different points 
form an interference pattern. One should think of Young’s second piece of 
cardboard as being at the top of the figure, with holes near the top left and 
top right of the figure. Figure 1.2 then plots the intensity (i.e., the square of 
the displacement) as a function of x, with y having the value corresponding 
to the bottom of Fig. 1.1. 

FIGURE 1.1. Interference of waves emitted from two slits. 
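
The interference pattern described above can be sketched numerically. The following Python fragment is a minimal illustration, assuming two in-phase point sources of equal strength and ignoring the decay of amplitude with distance; the function name and the particular geometry are hypothetical choices made for this example. It computes the time-averaged square of the combined displacement, which is largest where the two path lengths differ by a whole number of wavelengths and vanishes where they differ by half a wavelength more than a whole number.

```python
import math


def mean_square_displacement(x, y, separation, wavelength):
    """Time-averaged squared displacement at (x, y) for two in-phase sources.

    The sources sit at (-separation/2, 0) and (+separation/2, 0), and each
    emits a unit-amplitude wave cos(k*r - w*t). Averaging the square of the
    sum over one period gives 1 + cos(k*(r1 - r2)). Amplitude decay with
    distance is ignored (an assumption of this sketch).
    """
    k = 2 * math.pi / wavelength            # wave number
    r1 = math.hypot(x + separation / 2, y)  # distance to the left source
    r2 = math.hypot(x - separation / 2, y)  # distance to the right source
    return 1 + math.cos(k * (r1 - r2))


# Sources at (-1.5, 0) and (1.5, 0). The point (1.5, 4) is 5 units from one
# source and 4 units from the other, so the path difference is 1.
constructive = mean_square_displacement(1.5, 4.0, 3.0, wavelength=1.0)
destructive = mean_square_displacement(1.5, 4.0, 3.0, wavelength=2.0)
```

With a path difference of one full wavelength the two waves reinforce each other; with a path difference of half a wavelength they cancel.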

Despite the convincing nature of Young’s experiment, many proponents 
of the corpuscular theory of light remained unconvinced. In 1818, the 
French Academy of Sciences set up a competition for papers explaining 
the observed properties of light. One of the submissions was a paper by 
Augustin-Jean Fresnel in which he elaborated on Huygens’s wave model 
of refraction. A supporter of the corpuscular theory of light, Siméon-Denis 
Poisson read Fresnel’s submission and ridiculed it by pointing out that 
if that theory were true, light passing by an opaque disk would diffract 
around the edges of the disk to produce a bright spot in the center of the 
shadow of the disk, a prediction that Poisson considered absurd. Nevertheless,
the head of the judging committee for the competition, François 
Arago, decided to put the issue to an experimental test and found that 
such a spot does in fact occur. Although this spot is often called “Arago’s 
spot,” or even, ironically, “Poisson’s spot,” Arago eventually realized that 
the spot had been observed 100 years earlier in separate experiments by 
Delisle and Maraldi. 

Arago’s observation of Poisson’s spot led to widespread acceptance of 
the wave theory of light. This theory gained even greater acceptance in 
1865, when James Clerk Maxwell put together what are today known as 
Maxwell’s equations. Maxwell showed that his equations predicted that 
electromagnetic waves would propagate at a certain speed, which agreed 
with the observed speed of light. Maxwell thus concluded that light is simply
an electromagnetic wave. From 1865 until the end of the nineteenth 
century, the debate over the wave-versus-particle nature of light was considered
to have been conclusively settled in favor of the wave theory. 

FIGURE 1.2. Intensity plot for a horizontal line across the bottom of Fig. 1.1.


1.1.3 Blackbody Radiation 


In the early twentieth century, the wave theory of light began to experience 
new challenges. The first challenge came from the theory of blackbody radia- 
tion. In physics, a blackbody is an idealized object that perfectly absorbs all 
electromagnetic radiation that hits it. A blackbody can be approximated in 
the real world by an object with a highly absorbent surface such as “lamp 
black.” The problem of blackbody radiation concerns the distribution of 
electromagnetic radiation in a cavity within a blackbody. Although the 
walls of the blackbody absorb the radiation that hits them, thermal vibrations 
of the atoms making up the walls cause the blackbody to emit electromag- 
netic radiation. (At normal temperatures, most of the radiation emitted 
would be in the infrared range.) 

In the cavity, then, electromagnetic radiation is constantly absorbed and 
re-emitted until thermal equilibrium is reached, at which point the absorp- 
tion and emission of radiation are perfectly balanced at each frequency. 
According to the “equipartition theorem” of (classical) statistical mechan- 
ics, the energy in any given mode of electromagnetic radiation should be 
exponentially distributed, with an average value equal to $k_B T$, where $T$ is 
the temperature and $k_B$ is Boltzmann’s constant. (The temperature should 
be measured on a scale where absolute zero corresponds to T = 0.) The dif- 
ficulty with this prediction is that the average amount of energy is the same 
for every mode (hence the term “equipartition”). Thus, once one adds up 
over all modes—of which there are infinitely many—the predicted amount 
of energy in the cavity is infinite. This strange prediction is referred to as 
the ultraviolet catastrophe, since the infinitude of the energy comes from the 
ultraviolet (high-frequency) end of the spectrum. This ultraviolet catastro- 
phe does not seem to make physical sense and certainly does not match up 
with the observed energy spectrum within real-world blackbodies. 

An alternative prediction of the blackbody energy spectrum was offered 
by Max Planck in a paper published in 1900. Planck postulated that 
the energy in the electromagnetic field at a given frequency $\omega$ should be 
“quantized,” meaning that this energy should come only in integer multiples
of a certain basic unit equal to $\hbar\omega$, where $\hbar$ is a constant, which 
we now call Planck’s constant. Planck postulated that the energy would 
again be exponentially distributed, but only over integer multiples of $\hbar\omega$. 
At low frequencies, Planck’s theory predicts essentially the same energy as 
in classical statistical mechanics. At high frequencies, namely at frequencies
where $\hbar\omega$ is large compared to $k_B T$, Planck’s theory predicts a rapid 
fall-off of the average energy (see Exercise 2 for details). Indeed, if we measure
mass, distance, and time in units of grams, centimeters, and seconds, 
respectively, and we assign $\hbar$ the numerical value 


$$\hbar = 1.054 \times 10^{-27},$$


then Planck’s predictions match the experimentally observed blackbody 
spectrum. 
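
To make the comparison with equipartition concrete, the following Python sketch evaluates the mean energy of a single mode under Planck's hypothesis. If the energy takes the values $n\hbar\omega$ with probability proportional to $e^{-n\hbar\omega/(k_B T)}$, summing the geometric series gives a mean energy of $\hbar\omega/(e^{\hbar\omega/(k_B T)} - 1)$. The numerical value of Boltzmann's constant, the temperature, and the function names are supplied here for illustration and do not come from the text.

```python
import math

HBAR = 1.054e-27  # erg*s, the value quoted in the text (CGS units)
K_B = 1.381e-16   # erg/K, Boltzmann's constant (supplied here, not from the text)


def planck_mean_energy(omega, temperature):
    """Mean energy of one mode when energy comes in multiples of HBAR*omega."""
    x = HBAR * omega / (K_B * temperature)
    # Equivalent to HBAR*omega / (exp(x) - 1), written to avoid overflow.
    return HBAR * omega * math.exp(-x) / (-math.expm1(-x))


def equipartition_mean_energy(temperature):
    """Classical prediction: the same mean energy for every mode."""
    return K_B * temperature


T = 300.0  # kelvin, an arbitrary illustrative temperature
low = planck_mean_energy(1e10, T) / equipartition_mean_energy(T)
high = planck_mean_energy(1e16, T) / equipartition_mean_energy(T)
```

At the lower frequency the ratio `low` is very close to 1, so the two predictions agree. At the higher frequency the ratio `high` is vanishingly small, which is the rapid fall-off that removes the ultraviolet catastrophe.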

Planck pictured the walls of the blackbody as being made up of independent
oscillators of different frequencies, each of which is restricted to 
have energies that are integer multiples of $\hbar\omega$. Although this picture was clearly not intended as a 
realistic physical explanation of the quantization of electromagnetic energy 
in blackbodies, it does suggest that Planck thought that energy quantization
arose from properties of the walls of the cavity, rather than from intrinsic 
properties of the electromagnetic radiation. Einstein, on the other hand, in 
assessing Planck’s model, argued that energy quantization was inherent in 
the radiation itself. In Einstein’s picture, then, electromagnetic energy at 
a given frequency—whether in a blackbody cavity or not—comes in pack- 
ets or quanta having energy proportional to the frequency. Each quantum 
of electromagnetic energy constitutes what we now call a photon, which 
we may think of as a particle of light. Thus, Planck’s model of blackbody 
radiation began a rebirth of the particle theory of light. 

It is worth mentioning, in passing, that in 1900, the same year in which 
Planck’s paper on blackbody radiation appeared, Lord Kelvin gave a lec- 
ture that drew attention to another difficulty with the classical theory 
of statistical mechanics. Kelvin described two “clouds” over nineteenth- 
century physics at the dawn of the twentieth century. The first of these 
clouds concerned aether—a hypothetical medium through which electro- 
magnetic radiation propagates—and the failure of Michelson and Morley to 
observe the motion of earth relative to the aether. Under this cloud lurked 
the theory of special relativity. The second of Kelvin’s clouds concerned 
heat capacities in gases. The equipartition theorem of classical statisti- 
cal mechanics made predictions for the ratio of heat capacity at constant 
pressure ($c_p$) and the heat capacity at constant volume ($c_v$). These predictions
deviated substantially from the experimentally measured ratios. 
Under the second cloud lurked the theory of quantum mechanics, because 
the resolution of this discrepancy is similar to Planck’s resolution of the 
blackbody problem. As in the case of blackbody radiation, quantum mechanics
gives rise to a correction to the equipartition theorem, thus resulting
in different predictions for the ratio of $c_p$ to $c_v$, predictions that can be 
reconciled with the observed ratios. 


1.1.4 The Photoelectric Effect 


The year 1905 was Einstein’s annus mirabilis (miraculous year), in which 
Einstein published four ground-breaking papers, two on the special theory 
of relativity and one each on Brownian motion and the photoelectric effect. 
It was for the photoelectric effect that Einstein won the Nobel Prize in 
physics in 1921. In the photoelectric effect, electromagnetic radiation strik- 
ing a metal causes electrons to be emitted from the metal. Einstein found 
that as one increases the intensity of the incident light, the number of emit- 
ted electrons increases, but the energy of each electron does not change. 
This result is difficult to explain from the perspective of the wave theory of 
light. After all, if light is simply an electromagnetic wave, then increasing 
the intensity of the light amounts to increasing the strength of the electric 
and magnetic fields involved. Increasing the strength of the fields, in turn, 
ought to increase the amount of energy transferred to the electrons. 

Einstein’s results, on the other hand, are readily explained from a particle 
theory of light. Suppose light is actually a stream of particles (photons) with 
the energy of each particle determined by its frequency. Then increasing 
the intensity of light at a given frequency simply increases the number of 
photons and does not affect the energy of each photon. If each photon has 
a certain likelihood of hitting an electron and causing it to escape from 
the metal, then the energy of the escaping electron will be determined 
by the frequency of the incident light and not by the intensity of that 
light. The photoelectric effect, then, provided another compelling reason 
for believing that light can behave in a particlelike manner. 
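
The photon-counting argument above can be illustrated with a few lines of Python. This is only a sketch under the assumption that each photon carries energy $\hbar\omega$; the beam powers, the frequency, and the function names are hypothetical values chosen for the example.

```python
HBAR = 1.054e-27  # erg*s, the value quoted earlier in the text (CGS units)


def photon_energy(omega):
    """Energy of one photon of angular frequency omega, in ergs."""
    return HBAR * omega


def photons_per_second(power, omega):
    """Number of photons per second in a beam of the given power (erg/s)."""
    return power / photon_energy(omega)


omega = 4.0e15  # rad/s, an illustrative frequency
dim = photons_per_second(1.0e3, omega)
bright = photons_per_second(2.0e3, omega)  # double the intensity
```

Doubling the power doubles the number of photons arriving each second, while `photon_energy(omega)` is unchanged; only a change in frequency alters the energy available to each electron.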


1.1.5 The Double-Slit Experiment, Revisited 


Although the work of Planck and Einstein suggests that there is a par- 
ticlelike aspect to light, there is certainly also a wavelike aspect to light, 
as shown by Young, Arago, and Maxwell, among others. Thus, somehow, 
light must in some situations behave like a wave and in some situations 
like a particle, a phenomenon known as “wave-particle duality.” William 
Lawrence Bragg described the situation thus: 


God runs electromagnetics on Monday, Wednesday, and Friday 
by the wave theory, and the devil runs them by quantum theory 
on Tuesday, Thursday, and Saturday. 


(Apparently Sunday, being a day of rest, did not need to be accounted for.) 

In particular, we have already seen that Young’s double-slit experiment 
in the early nineteenth century was one important piece of evidence in fa- 
vor of the wave theory of light. If light is really made up of particles, as 
blackbody radiation and the photoelectric effect suggest, one must give a 
particle-based explanation of the double-slit experiment. J.J. Thomson sug- 
gested in 1907 that the patterns of light seen in the double-slit experiment 
could be the result of different photons somehow interfering with one an- 
other. Thomson thus suggested that if the intensity of light were sufficiently 
reduced, the photons in the light would become widely separated and the 
interference pattern might disappear. In 1909, Geoffrey Ingram Taylor set 
out to test this suggestion and found that even when the intensity of light 
was drastically reduced (to the point that it took three months for one of 
the images to form), the interference pattern remained the same. 

Since Taylor’s results suggest that interference remains even when the 
photons are widely separated, the photons are not interfering with one an- 
other. Rather, as Paul Dirac put it in Chap. 1 of [6], “Each photon then 
interferes only with itself.” To state this in a different way, since there is no 
interference when there is only one slit, Taylor’s results suggest that each 
individual photon passes through both slits. By the early 1960s, it became 
possible to perform double-slit experiments with electrons instead of pho- 
tons, yielding even more dramatic confirmations of the strange behavior of 
matter in the quantum realm. (See Sect. 1.2.4.) 


1.2 Is an Electron a Wave or a Particle? 


In the early part of the twentieth century, the atomic theory of matter 
became firmly established. (Einstein’s 1905 paper on Brownian motion was 
an important confirmation of the theory and provided the first calculation 
of atomic masses in everyday units.) Experiments performed in 1909 by 
Hans Geiger and Ernest Marsden, under the direction of Ernest Rutherford, 
led Rutherford to put forward in 1911 a picture of atoms in which a small 
nucleus contains most of the mass of the atom. In Rutherford’s model, 
each atom has a positively charged nucleus with charge nq, where n is 
a positive integer (the atomic number) and q is the basic unit of charge 
first observed in Millikan’s famous oil-drop experiment. Surrounding the 
nucleus is a cloud of n electrons, each having negative charge $-q$. When 
atoms bind into molecules, some of the electrons of one atom may be shared 
with another atom to form a bond between the atoms. This picture of atoms 
and their binding led to the modern theory of chemistry. 

Basic to the atomic theory is that electrons are particles; indeed, the 
number of electrons per atom is supposed to be the atomic number. Never- 
theless, it did not take long after the atomic theory of matter was confirmed 
before wavelike properties of electrons began to be observed. The situation, 
then, is the reverse of that with light. While light was long thought to be 
a wave (at least from the publication of Maxwell’s equations in 1865 until 
Planck’s work in 1900) and was only later seen to have particlelike behavior, 
electrons were initially thought to be particles and were only later seen to 
have wavelike properties. In the end, however, both light and electrons have 
both wavelike and particlelike properties. 


1.2.1 The Spectrum of Hydrogen 


If electricity is passed through a tube containing hydrogen gas, the gas will 
emit light. If that light is separated into different frequencies by means 
of a prism, bands will become apparent, indicating that the light is not a 
continuous mix of many different frequencies, but rather consists only of a 
discrete family of frequencies. In view of the photonic theory of light, the 
energy in each photon is proportional to its frequency. Thus, each observed 
frequency corresponds to a certain amount of energy being transferred from 
a hydrogen atom to the electromagnetic field. 

Now, a hydrogen atom consists of a single proton surrounded by a single 
electron. Since the proton is much more massive than the electron, one 
can picture the proton as being stationary, with the electron orbiting it. 
The idea, then, is that the current being passed through the gas causes some 
of the electrons to move to a higher-energy state. Eventually, that electron 
will return to a lower-energy state, emitting a photon in the process. In this 
way, by observing the energies (or, equivalently, the frequencies) of the 
emitted photons, one can work backwards to the change in energy of the 
electron. 

The curious thing about the state of affairs in the preceding paragraph 
is that the energies of the emitted photons—and hence, also, the energies 
of the electron—come only in a discrete family of possible values. Based 
on the observed frequencies, Johannes Rydberg concluded in 1888 that the 
possible energies of the electron were of the form 


$$E_n = -\frac{R}{n^2}, \qquad n = 1, 2, 3, \ldots \qquad (1.1)$$

Here, $R$ is the “Rydberg constant,” given (in “Gaussian units”) by 


$$R = \frac{m_e Q^4}{2\hbar^2},$$


where $Q$ is the charge of the electron and $m_e$ is the mass of the electron. 
(Technically, $m_e$ should be replaced by the reduced mass $\mu$ of the
proton-electron system; that is, $\mu = m_e m_p/(m_e + m_p)$, where $m_p$ is the mass 
of the proton. However, since the proton mass is much greater than the 
electron mass, $\mu$ is almost the same as $m_e$ and we will neglect the difference 
between the two.) The energies in (1.1) agree with experiment, in that all 
the observed frequencies in hydrogen are (at least to the precision available 
at the time of Rydberg) of the form 


$$\omega = \frac{1}{\hbar}(E_n - E_m), \qquad (1.2)$$


for some n > m. It should be noted that Johann Balmer had already 
observed in 1885 frequencies of the same form, but only in the case m = 2, 
and that Balmer’s work influenced Rydberg. 

The frequencies in (1.2) are known as the spectrum of hydrogen. Balmer 
and Rydberg were merely attempting to find a simple formula that would 
match the observed frequencies in hydrogen. Neither of them had a the- 
oretical explanation for why only these particular frequencies occur. Such 
an explanation would have to wait until the beginnings of quantum theory 
in the twentieth century. 
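
As a rough numerical illustration of (1.1) and (1.2), the following sketch computes the wavelengths of the Balmer lines (m = 2). It assumes the rounded values R ≈ 13.605693 eV and hc ≈ 1239.841984 eV·nm, which are not given in the text, and it converts the angular frequency ω = (E_n − E_m)/ħ into a wavelength via λ = 2πc/ω = hc/(E_n − E_m). The function names are our own. Since the electron mass is used rather than the reduced mass, the results agree with measured values only to within a fraction of a nanometer.

```python
# Assumed constants (rounded CODATA values; not taken from the text).
RYDBERG_EV = 13.605693      # Rydberg energy R, in electron-volts
HC_EV_NM = 1239.841984      # h*c, in eV*nm


def energy_level(n: int) -> float:
    """Energy E_n = -R / n^2 of the n-th level, in eV, as in (1.1)."""
    return -RYDBERG_EV / n**2


def photon_wavelength_nm(n: int, m: int) -> float:
    """Wavelength (nm) of the photon emitted in the transition n -> m, n > m."""
    if n <= m:
        raise ValueError("need n > m for emission")
    photon_energy = energy_level(n) - energy_level(m)   # positive, in eV
    return HC_EV_NM / photon_energy


# Balmer series (m = 2): the visible lines of hydrogen.
for n in range(3, 7):
    print(f"n = {n} -> m = 2: {photon_wavelength_nm(n, 2):.1f} nm")
```

The four printed wavelengths fall near 656, 486, 434, and 410 nm, which is the visible part of the hydrogen spectrum studied by Balmer.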


1.2.2 The Bohr-de Broglie Model of the Hydrogen Atom 


In 1913, Niels Bohr introduced a model of the hydrogen atom that at- 
tempted to explain the observed spectrum of hydrogen. Bohr pictured the 
hydrogen atom as consisting of an electron orbiting a positively charged 
nucleus, in much the same way that a planet orbits the sun. Classically, 
the force exerted on the electron by the proton follows the inverse square 
law of the form 


\[
F = \frac{Q^2}{r^2}, \tag{1.3}
\]
where Q is the charge of the electron, in appropriate units. 


If the electron is in a circular orbit, its trajectory in the plane of the 
orbit will take the form 


\[
(x(t), y(t)) = (r\cos(\omega t),\, r\sin(\omega t)).
\]


If we take the second derivative with respect to time to obtain the acceler- 
ation vector a, we obtain 


\[
a(t) = (-\omega^2 r\cos(\omega t),\, -\omega^2 r\sin(\omega t)),
\]

so that the magnitude of the acceleration vector is ω²r. Newton’s second 
law, F = ma, then requires that 

\[
m_e \omega^2 r = \frac{Q^2}{r^2},
\]


so that 

\[
\omega = \sqrt{\frac{Q^2}{m_e r^3}}.
\]

From the formula for the frequency, we can calculate that the momentum 
(mass times velocity) has magnitude 

\[
p = m_e \omega r = \sqrt{\frac{m_e Q^2}{r}}. \tag{1.4}
\]


We can also calculate the angular momentum J, which for a circular orbit 
is just the momentum times the distance from the nucleus, as 


\[
J = \sqrt{m_e Q^2 r}.
\]


Bohr postulated that the electron obeys classical mechanics, except that 
its angular momentum is “quantized.” Specifically, in Bohr’s model, the 
angular momentum is required to be an integer multiple of ħ (Planck’s 
constant). Setting J equal to nħ yields 

\[
r_n = \frac{n^2\hbar^2}{m_e Q^2}. \tag{1.5}
\]


If one calculates the energy of an orbit with radius r_n, one finds (Exercise 3) 
that it agrees precisely with the Rydberg energies in (1.1). Bohr further 
postulated that an electron could move from one allowed state to another, 
emitting a packet of light in the process with frequency given by (1.2). 
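
The agreement between the Bohr orbits and the Rydberg energies can be checked numerically before it is proved. The sketch below works in units where m_e, Q, and ħ can be set to any positive values (the defaults of 1 are an assumption made for convenience), uses the momentum formula (1.4) and the radii (1.5), and takes the potential energy to be −Q²/r as in Exercise 3. The function names are ours. It is only a sanity check, not a substitute for the exercise.

```python
import math


def bohr_radius(n, m_e=1.0, Q=1.0, hbar=1.0):
    """Allowed radius r_n = n^2 hbar^2 / (m_e Q^2), as in (1.5)."""
    return n**2 * hbar**2 / (m_e * Q**2)


def orbit_energy(r, m_e=1.0, Q=1.0):
    """Kinetic plus potential energy of a circular orbit of radius r."""
    p = math.sqrt(m_e * Q**2 / r)      # momentum of the orbit, as in (1.4)
    kinetic = p**2 / (2.0 * m_e)
    potential = -Q**2 / r
    return kinetic + potential


def rydberg_energy(n, m_e=1.0, Q=1.0, hbar=1.0):
    """E_n = -R / n^2 with R = m_e Q^4 / (2 hbar^2), as in (1.1)."""
    R = m_e * Q**4 / (2.0 * hbar**2)
    return -R / n**2


# Compare the two energies for the first few allowed orbits.
for n in range(1, 6):
    print(n, orbit_energy(bohr_radius(n)), rydberg_energy(n))
```

In each row the two energies coincide up to rounding, as the text asserts.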

Bohr did not explain why the angular momentum of an electron is quan- 
tized, nor how it moved from one allowed orbit to another. As such, his 
theory of atomic behavior was clearly not complete; it belongs to the “old 
quantum mechanics” that was superseded by the matrix model of Heisenberg 
and the wave model of Schrödinger. Nevertheless, Bohr’s model was an 
important step in the process of understanding the behavior of atoms, and 
Bohr was awarded the 1922 Nobel Prize in physics for his work. Some rem- 
nant of Bohr’s approach survives in modern quantum theory, in the WKB 
approximation (Chap. 15), where the Bohr-Sommerfeld condition gives an 
approximation to the energy levels of a one-dimensional quantum system. 

In 1924, Louis de Broglie reinterpreted Bohr’s condition on the angular 
momentum as a wave condition. The de Broglie hypothesis is that an elec- 
tron can be described by a wave, where the spatial frequency k of the wave 
is related to the momentum of the electron by the relation 


\[
p = \hbar k. \tag{1.6}
\]

Here, “frequency” is defined so that the frequency of the function cos(kx) 
is k. This is “angular” frequency, which differs by a factor of 2π from the 
cycles-per-unit-distance frequency. Thus, the period associated with a given 
frequency k is 2π/k. 

FIGURE 1.3. The Bohr radii for n = 1 to n = 10, with de Broglie waves 
superimposed for n = 8 and n = 10. 

In de Broglie’s approach, we are supposed to imagine a wave superimposed 
on the classical trajectory of the electron, with the quantization 
condition now being that the wave should match up with itself when going 
all the way around the orbit. This condition means that the orbit should 
consist of an integer number of periods of the wave: 

\[
2\pi r = n\,\frac{2\pi}{k}.
\]

Using (1.6) along with the expression (1.4) for p, we obtain 

\[
2\pi r = \frac{2\pi n\hbar}{p} = 2\pi n\hbar\sqrt{\frac{r}{m_e Q^2}}.
\]

Solving this equation for r gives precisely the Bohr radii in (1.5). 
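
The standing-wave condition can also be checked directly: for the n-th Bohr radius, the circumference of the orbit divided by the spatial period 2π/k of the de Broglie wave should be exactly n. The following sketch does this computation, again in units where m_e, Q, and ħ default to 1 (an assumption made for convenience); the function name is ours.

```python
import math


def periods_around_orbit(n, m_e=1.0, Q=1.0, hbar=1.0):
    """Number of de Broglie periods that fit around the n-th Bohr orbit."""
    r = n**2 * hbar**2 / (m_e * Q**2)     # Bohr radius, as in (1.5)
    p = math.sqrt(m_e * Q**2 / r)         # momentum on that orbit, as in (1.4)
    k = p / hbar                          # spatial frequency, from p = hbar * k
    period = 2.0 * math.pi / k            # spatial period of the wave
    return 2.0 * math.pi * r / period


for n in (1, 2, 8, 10):
    print(n, periods_around_orbit(n))
```

For n = 8 and n = 10 this reproduces the number of oscillations one can count on the corresponding orbits in Figure 1.3.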

Thus, de Broglie’s wave hypothesis gives an alternative to Bohr’s quan- 
tization of angular momentum as an explanation of the allowed energies of 
hydrogen. Of course, if one accepts de Broglie’s wave hypothesis for elec- 
trons, one would expect to see wavelike behavior of electrons not just in the 
hydrogen atom, but in other situations as well, an expectation that would 
soon be fulfilled. Figure 1.3 shows the first 10 Bohr radii. For the 8th and 
10th radii, the de Broglie wave is shown superimposed onto the orbit. 


1.2.3 Electron Diffraction 


In 1925, Clinton Davisson and Lester Germer were studying properties of 
nickel by bombarding a thin film of nickel with low-energy electrons. As a 
result of a problem with their equipment, the nickel was accidentally heated 
to a very high temperature. When the nickel cooled, it formed into large 
crystalline pieces, rather than the small crystals in the original sample. 
After this recrystallization, Davisson and Germer observed peaks in the 
pattern of electrons reflecting off of the nickel sample that had not been 
present when using the original sample. They were at a loss to explain this 
pattern until, in 1926, Davisson learned of the de Broglie hypothesis and 
suspected that they were observing the wavelike behavior of electrons that 
de Broglie had predicted. 

After this realization, Davisson and Germer began to look systemati- 
cally for wavelike peaks in their experiments. Specifically, they attempted 
to show that the pattern of angles at which the electrons reflected matched 
the patterns one sees in x-ray diffraction. After numerous additional mea- 
surements, they were able to show a very close correspondence between 
the pattern of electrons and the patterns seen in x-ray diffraction. Since 
x-rays were by this time known to be waves of electromagnetic radiation, 
the Davisson—Germer experiment was a strong confirmation of de Broglie’s 
wave picture of electrons. Davisson and Germer published their results in 
two papers in 1927, and Davisson shared the 1937 Nobel Prize in physics 
with George Paget Thomson, who had observed electron diffraction shortly after 
Davisson and Germer. 


1.2.4 The Double-Slit Experiment with Electrons 


Although quantum theory clearly predicts that electrons passing through 
a double slit will experience interference similar to that observed in light, 
it was not until Claus Jönsson’s work in 1961 that this prediction was 
confirmed experimentally. The main difficulty is the much smaller 
wavelength for electrons of reasonable energy than for visible light. Jönsson’s 
electrons, for example, had a de Broglie wavelength of about 5 pm, as compared 
to a wavelength of roughly 500 nm for visible light (depending on the color). 
In results published in 1989, a team led by Akira Tonomura at Hitachi 
performed a double-slit experiment in which they were able to record the 
results one electron at a time. (Similar but less definitive experiments were 
carried out by Pier Giorgio Merli, GianFranco Missiroli and Giulio Pozzi 
in Bologna in 1974 and published in the American Journal of Physics in 
1976.) In the Hitachi experiment, each electron passes through the slits and 
then strikes a screen, causing a small spot of light to appear. The location of 
this spot is then recorded for each electron, one at a time. The key point is 
that each individual electron strikes the screen at a single point. That is to 
say, individual electrons are not smeared out across the screen in a wavelike 
pattern, but rather behave like point particles, in that the observed location 
of the electron is indeed a point. Each electron, however, strikes the screen 
at a different point, and once a large number of the electrons have struck 
and their locations have been recorded, an interference pattern emerges. 
It is not the variability of the locations of the electrons that is surprising, 
since this could be accounted for by small variations in the way the electrons 
are shot toward the slits. Rather, it is the distinctive interference pattern 
that is surprising, with rapid variations in the pattern of electron strikes 
over short distances, including regions where almost no electron strikes 
occur. (Compare Fig. 1.4 to Fig. 1.2.) Note also that in the experiment, the 
electrons are widely separated, so that there is never more than one electron 
in the apparatus at any one time. Thus, the electrons cannot interfere with 
one another; rather, each electron interferes with itself. Figure 1.4 shows 
results from the Hitachi experiment, with the number of observed electrons 
increasing from about 150 in the first image to 160,000 in the last image. 

FIGURE 1.4. Four images from the 1989 experiment at Hitachi showing the 
impact of individual electrons gradually building up to form an interference 
pattern. Image by Akira Tonomura and Wikimedia Commons user Belsazar. File 
is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported 
license. 


1.3 Schrödinger and Heisenberg 


In 1925, Werner Heisenberg proposed a model of quantum mechanics based 
on treating the position and momentum of the particle as, essentially, 
matrices of size ∞ × ∞. Actually, Heisenberg himself was not familiar with 
the theory of matrices, which was not a standard part of the mathematical 
education of physicists at the time. Nevertheless, he had quantities of the 
form x_jk and p_jk (where j and k each vary over all integers), which we 
can recognize as matrices, as well as expressions such as Σ_l x_jl p_lk, which 
we can recognize as a matrix product. After Heisenberg explained his theory 
to Max Born, Born recognized the connection of Heisenberg’s formulas 
to matrix theory and made the matrix point of view explicit, in a paper 
coauthored by Born and his assistant, Pascual Jordan. Born, Heisenberg, 
and Jordan then all published a paper together elaborating upon their the- 
ory. The papers of Heisenberg, of Born and Jordan, and of Born, Heisen- 
berg, and Jordan all appeared in 1925. Heisenberg received the 1932 Nobel 
Prize in physics (actually awarded in 1933) for his work. Born’s exclusion 
from this prize was controversial, and may have been influenced by Jordan’s 
connections with the Nazi party in Germany. (Heisenberg’s own work for 
the Nazis during World War II was also a source of much controversy after 
the war.) In any case, Born was awarded the Nobel Prize in physics in 
1954 for his work on the statistical interpretation of quantum mechanics 
(Sect. 1.4). 

Meanwhile, in 1926, Erwin Schrödinger published four remarkable papers 
in which he proposed a wave theory of quantum mechanics, along the lines 
of the de Broglie hypothesis. In these papers, Schrödinger described how the 
waves evolve over time and showed that the energy levels of, for example, 
the hydrogen atom could be understood as eigenvalues of a certain operator. 
(See Chap. 18 for the computation for hydrogen.) Schrödinger also 
showed that the Heisenberg–Born–Jordan matrix model could be incorporated 
into the wave theory, thus showing that the matrix theory and the 
wave theory were equivalent (see Sect. 3.8). This book describes the 
mathematical structure of quantum mechanics in essentially the form proposed 
by Schrödinger in 1926. Schrödinger shared the 1933 Nobel Prize in physics 
with Paul Dirac. 


1.4 A Matter of Interpretation 


Although Schrödinger’s 1926 papers gave the correct mathematical description 
of quantum mechanics (as it is generally accepted today), he did not 
provide a widely accepted interpretation of the theory. That task fell to 
Born, who in a 1926 paper proposed that the “wave function” (as the wave 
appearing in the Schrödinger equation is generally called) should be 
interpreted statistically, that is, as determining the probabilities for 
observations of the system. Over time, Born’s statistical approach developed 
into the Copenhagen interpretation of quantum mechanics. Under this 
interpretation, the wave function ψ of the system is not directly observable. 
Rather, ψ merely determines the probability of observing a particular result. 

In particular, if ψ is properly normalized, then the quantity |ψ(x)|² is 
the probability distribution for the position of the particle. Even if ψ itself 
is spread out over a large region in space, any measurement of the position 
of the particle will show that the particle is located at a single point, just 
as we see for the electrons in the two-slit experiment in Fig. 1.4. Thus, a 
measurement of a particle’s position does not show the particle “smeared 
out” over a large region of space, even if the wave function ψ is smeared 
out over a large region. 

Consider, for example, how Born’s interpretation of the Schrödinger 
equation would play out in the context of the Hitachi double-slit 
experiment depicted in Fig. 1.4. Born would say that each electron has a wave 
function that evolves in time according to the Schrödinger equation (an 
equation of wave type). Each particle’s wave function, then, will 
propagate through the slits in a manner similar to that pictured in Fig. 1.1. If 
there is a screen at the bottom of Fig. 1.1, then the electron will hit the 
screen at a single point, even though the wave function is very spread out. 
The wave function does not determine where the particle hits the screen; it 
merely determines the probabilities for where the particle hits the screen. If 
a whole sequence of electrons passes through the slits, one after the other, 
over time a probability distribution will emerge, determined by the square 
of the magnitude of the wave function, which is shown in Fig. 1.2. Thus, 
the probability distribution of electrons, as seen from a large number of 
electrons as in Fig. 1.4, shows wavelike interference patterns, even though 
each individual electron strikes the screen at a single point. 
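
The way a pattern emerges from individual point-like hits can be imitated with a small simulation. The sketch below is a toy model, not a description of the Hitachi experiment: it assumes an idealized density |ψ(x)|² proportional to cos²(πx/d) on the screen (bright fringes at integer multiples of the assumed spacing d, dark fringes halfway between) and draws each electron’s position independently from that density by rejection sampling. All names and parameter values are our own choices.

```python
import math
import random


def intensity(x, fringe_spacing=1.0):
    """Toy |psi(x)|^2 on the screen: ideal two-slit fringes, unnormalized."""
    return math.cos(math.pi * x / fringe_spacing) ** 2


def sample_hits(num_electrons, half_width=3.0, seed=0):
    """Draw screen positions one electron at a time by rejection sampling."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < num_electrons:
        x = rng.uniform(-half_width, half_width)
        # Accept the candidate with probability intensity(x), which is at most 1.
        if rng.random() < intensity(x):
            hits.append(x)
    return hits


def count_near(hits, center, radius):
    """Number of recorded hits within the given distance of a screen position."""
    return sum(1 for x in hits if abs(x - center) < radius)


for n in (100, 10000):
    hits = sample_hits(n)
    print(n, "electrons:",
          count_near(hits, 0.0, 0.1), "near a bright fringe,",
          count_near(hits, 0.5, 0.1), "near a dark fringe")
```

Each simulated electron lands at a single point, yet after many electrons the counts near a dark fringe are far smaller than those near a bright fringe, in line with the statistical interpretation described above.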

It is essential to the theory that the wave function ψ(x) itself is not the 
probability density for the location of the particle. Rather, the probability 
density is |ψ(x)|². The difference is crucial, because probability densities 
are intrinsically positive and thus do not exhibit destructive interference. 
The wave function itself, however, is complex-valued, and the real and 
imaginary parts of the wave function take on both positive and negative 
values, which can interfere constructively or destructively. The part of the 
wave function passing through the first slit, for example, can interfere with 
the part of the wave function passing through the second slit. Only after 
this interference has taken place do we take the magnitude squared of the 
wave function to obtain the probability distribution, which will, therefore, 
show the sorts of peaks and valleys we see in Fig. 1.2. 
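
The distinction between adding wave functions and adding probability densities can be made concrete with two unit-magnitude complex amplitudes. In the sketch below, each slit is assumed to contribute an amplitude exp(ikd), where d is the distance from that slit to a point on the screen and k = 2π/λ; this plane-wave form, the wavelength, and the path lengths are illustrative assumptions, and the function names are ours.

```python
import cmath
import math


def slit_amplitude(path_length, wavelength):
    """Complex amplitude exp(i k d) contributed by one slit (unit magnitude)."""
    k = 2.0 * math.pi / wavelength
    return cmath.exp(1j * k * path_length)


def density_from_wave(d1, d2, wavelength):
    """|psi_1 + psi_2|^2: add the complex amplitudes first, then square."""
    total = slit_amplitude(d1, wavelength) + slit_amplitude(d2, wavelength)
    return abs(total) ** 2


def density_without_interference(d1, d2, wavelength):
    """|psi_1|^2 + |psi_2|^2: the result of adding the densities instead."""
    return (abs(slit_amplitude(d1, wavelength)) ** 2
            + abs(slit_amplitude(d2, wavelength)) ** 2)


wavelength = 1.0
for path_difference in (0.0, 0.25, 0.5):
    d1, d2 = 10.0, 10.0 + path_difference
    print(path_difference,
          density_from_wave(d1, d2, wavelength),
          density_without_interference(d1, d2, wavelength))
```

When the path difference is half a wavelength, the two amplitudes cancel and the first density is zero up to rounding, whereas the sum of the individual densities is 2 at every point and shows no peaks or valleys.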

Born’s introduction of a probabilistic element into the interpretation of 
quantum mechanics was—and to some extent still is—controversial. Ein- 
stein, for example, is often quoted as saying something along the lines of, 
“God does not play at dice with the universe.” Einstein expressed the same 
sentiment in various ways over the years. His earliest known statement to 
this effect was in a letter to Born in December 1926, in which he said, 


Quantum mechanics is certainly imposing. But an inner voice 
tells me that it is not yet the real thing. The theory says a lot, 
but does not really bring us any closer to the secret of the “old 
one.” I, at any rate, am convinced that He does not throw dice. 


Many other physicists and philosophers have questioned the probabilistic 
interpretation of quantum mechanics, and have sought alternatives, such 
as “hidden variable” theories. Nevertheless, the Copenhagen interpretation 
of quantum mechanics, essentially as proposed by Born in 1926, remains 
the standard one. This book resolutely avoids all controversies surround- 
ing the interpretation of quantum mechanics. Chapter 3, for example, 
presents the standard statistical interpretation of the theory without ques- 
tion. The book may nevertheless be of use to the more philosophically 
minded reader, in that one must learn something of quantum mechanics 
before delving into the (often highly technical) discussions about its inter- 
pretation. 


1.5 Exercises 


1. Beginning with the formula for the sum of a geometric series, use 
differentiation to obtain the identity 

\[
\sum_{n=0}^{\infty} n e^{-An} = \frac{e^{-A}}{(1 - e^{-A})^2}.
\]


2. In Planck’s model of blackbody radiation, the energy in a given 
frequency ω of electromagnetic radiation is distributed randomly over 
all numbers of the form nħω, where n = 0, 1, 2, .... Specifically, the 
likelihood of finding energy nħω is postulated to be 

\[
p(E = n\hbar\omega) = \frac{1}{Z} e^{-\beta n\hbar\omega},
\]

where Z is a normalization constant, which is chosen so that the sum 
over n of the probabilities is 1. Here β = 1/(k_B T), where T is the 
temperature and k_B is Boltzmann’s constant. The expected value of 
the energy, denoted ⟨E⟩, is defined to be 

\[
\langle E \rangle = \frac{1}{Z} \sum_{n=0}^{\infty} (n\hbar\omega)\, e^{-\beta n\hbar\omega}.
\]

(a) Using Exercise 1, show that 

\[
\langle E \rangle = \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}.
\]

(b) Show that ⟨E⟩ behaves like 1/β = k_B T for small ω, but that 
⟨E⟩ decays exponentially as ω tends to infinity. 


Note: In applying the above calculation to blackbody radiation, one 
must also take into account the number of modes having frequency 
in a given range, say between ω₀ and ω₀ + ε. The exact number of 
such frequencies depends on the shape of the cavity, but according to 
Weyl’s law, this number will be approximately proportional to εω₀² for 
large values of ω₀. Thus, the amount of energy per unit of frequency is 

\[
C\,\frac{\hbar\omega^3}{e^{\beta\hbar\omega} - 1}, \tag{1.7}
\]

where C is a constant involving the volume of the cavity and the 
speed of light. The relation (1.7) is known as Planck’s law. 
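
Before attempting Exercise 2, it can be reassuring to compare the defining sum for the expected energy with the closed form ħω/(e^{βħω} − 1) numerically. The sketch below does so in units where β = 1 (an assumption made for convenience), truncating the infinite sum after a fixed number of terms; the function names and the truncation length are our own choices. It is a sanity check only and does not replace the derivation asked for in the exercise.

```python
import math


def mean_energy_by_sum(beta, hbar_omega, num_terms=5000):
    """Expected energy computed directly from weights exp(-beta*n*hbar_omega)."""
    weights = [math.exp(-beta * n * hbar_omega) for n in range(num_terms)]
    Z = sum(weights)                       # normalization constant
    return sum(n * hbar_omega * w for n, w in enumerate(weights)) / Z


def mean_energy_closed_form(beta, hbar_omega):
    """Closed form hbar_omega / (exp(beta*hbar_omega) - 1)."""
    return hbar_omega / math.expm1(beta * hbar_omega)


beta = 1.0   # units in which k_B T = 1
for hbar_omega in (0.01, 1.0, 10.0):
    print(hbar_omega,
          mean_energy_by_sum(beta, hbar_omega),
          mean_energy_closed_form(beta, hbar_omega))
```

For the smallest frequency the expected energy is close to 1/β = 1, while for the largest it is already tiny, which is the behavior described in part (b).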


3. In classical mechanics, the kinetic energy of an electron is m_e v²/2, 
where v is the magnitude of the velocity. Meanwhile, the potential 
energy associated with the force law (1.3) is V(r) = −Q²/r, since 
dV/dr = F. Show that if the particle is moving in a circular orbit 
with radius r_n given by (1.5), then the total energy (kinetic plus 
potential) of the particle is E_n, as given in (1.1). 


2 


A First Approach to Classical 
Mechanics 


2.1 Motion in ℝ¹ 


2.1.1 Newton’s law 


We begin by considering the motion of a single particle in ℝ¹, which may 
be thought of as a particle sliding along a wire, or a particle with motion 
that just happens to lie in a line. We let x(t) denote the particle’s position 
as a function of time. The particle’s velocity is then 

\[
v(t) := \dot{x}(t),
\]

where we use a dot over a symbol to denote the derivative of that quantity 
with respect to the time t. 
The particle’s acceleration is then 

\[
a(t) := \dot{v}(t) = \ddot{x}(t),
\]

where ẍ denotes the second derivative of x with respect to t. We assume 
that there is a force acting on the particle and we assume at first that the 
force F is a function of the particle’s position only. (Later, we will look at 
the case of forces that depend also on velocity.) 

Under these assumptions, Newton’s second law (F = ma) takes the form 

\[
F(x(t)) = ma(t) = m\ddot{x}(t), \tag{2.1}
\]


where m is the mass of the particle, which is assumed to be positive. We will 
henceforth abbreviate Newton’s second law as simply “Newton’s law,” since 
we will use the second law much more frequently than the others. Since 
(2.1) is of second order, the appropriate initial conditions (needed to get 
a unique solution) are the position and velocity at some initial time t₀. So 
we look for solutions of (2.1) subject to 

\begin{align*}
x(t_0) &= x_0, \\
\dot{x}(t_0) &= v_0.
\end{align*}

Assuming that F is a smooth function, standard results from the 
elementary theory of differential equations tell us that there exists a unique 
local solution to (2.1) for each pair of initial conditions. (A local solution 
is one defined for t in a neighborhood of the initial time t₀.) Since (2.1) is 
in general a nonlinear equation, one cannot expect that, for a general force 
function F, the solutions will exist for all t. If, for example, F(x) = x², then 
any solution with positive initial position and positive initial velocity will 
escape to infinity in finite time. (Apply Exercise 4 with V(x) = −x³/3.) 
For a proof of existence and uniqueness, see Example 8.2 and Theorem 8.13 
in [28]. 


Definition 2.1 A solution x(t) to Newton’s law is called a trajectory. 


Example 2.2 (Harmonic Oscillator) If the force is given by Hooke’s 
law, F(x) = −kx, where k is a positive constant, then Newton’s law can be 
written as mẍ + kx = 0. The general solution of this equation is 

\[
x(t) = a\cos(\omega t) + b\sin(\omega t),
\]

where ω := √(k/m) is the frequency of oscillation. 


The system in Example 2.2 is referred to as a (classical) harmonic os- 
cillator. This system can describe a mass on a spring, where the force is 
proportional to the distance x that the spring is stretched from its 
equilibrium position. The minus sign in −kx indicates that the force pulls the 
oscillator back toward equilibrium. Here and elsewhere in the book, we 
use the “angular” notion of frequency, which is the rate of change of the 
argument of a sine or cosine function. If ω is the angular frequency, then 
the “ordinary” frequency (i.e., the number of cycles per unit of time) is 
ω/2π. Saying that x has (angular) frequency ω means that x is periodic 
with period 2π/ω. 
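
The claims in Example 2.2 are easy to test numerically. The sketch below approximates the second derivative of x(t) = a cos(ωt) + b sin(ωt) by a central difference and checks that mẍ + kx is close to zero, and that x repeats after the period 2π/ω. The particular values of k, m, a, b and the step size h are arbitrary assumptions, and the function names are ours.

```python
import math


def trajectory(t, a, b, omega):
    """General solution x(t) = a cos(omega t) + b sin(omega t)."""
    return a * math.cos(omega * t) + b * math.sin(omega * t)


def newton_residual(t, a, b, k, m, h=1e-4):
    """m*x''(t) + k*x(t), with x'' approximated by a central difference."""
    omega = math.sqrt(k / m)

    def x(s):
        return trajectory(s, a, b, omega)

    x_ddot = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
    return m * x_ddot + k * x(t)


k, m, a, b = 3.0, 2.0, 1.5, -0.5
omega = math.sqrt(k / m)
period = 2.0 * math.pi / omega
print("residual of Newton's law:", newton_residual(0.7, a, b, k, m))
print("x(t), x(t + period):",
      trajectory(0.7, a, b, omega), trajectory(0.7 + period, a, b, omega))
```

The residual is not exactly zero because of the finite-difference approximation, but it is small compared with the individual terms mẍ and kx.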


2.1.2 Conservation of Energy 


We return now to the case of a general force function F(x). We define 
the kinetic energy of the system to be ½mv². We also define the potential 
energy of the system as the function 

\[
V(x) = -\int F(x)\,dx, \tag{2.2}
\]

so that F(x) = −dV/dx. (The potential energy is defined only up to adding 
a constant.) The total energy E of the system is then 

\[
E(x, v) = \frac{1}{2}mv^2 + V(x). \tag{2.3}
\]


The chief significance of the energy function is that it is conserved, meaning 
that its value along any trajectory is constant. 


Theorem 2.3 Suppose a particle satisfies Newton’s law in the form mẍ = 
F(x). Let V and E be as in (2.2) and (2.3). Then the energy E is conserved, 
meaning that for each solution x(t) of Newton’s law, E(x(t), ẋ(t)) is 
independent of t. 


Proof. We verify this by differentiation, using the chain rule: 


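\begin{align*}
\frac{d}{dt} E(x(t), \dot{x}(t)) &= \frac{d}{dt}\left( \frac{1}{2} m (\dot{x}(t))^2 + V(x(t)) \right) \\
&= m\dot{x}(t)\ddot{x}(t) + \frac{dV}{dx}\,\dot{x}(t) \\
&= \dot{x}(t)\,[m\ddot{x}(t) - F(x(t))].
\end{align*}

This last expression is zero by Newton’s law. Thus, the time-derivative of 
the energy along any trajectory is zero, so E(x(t), ẋ(t)) is independent of 
t, as claimed. ∎ 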

We may call the energy a conserved quantity (or constant of motion), 
since the particle neither gains nor loses energy as the particle moves 
according to Newton’s law. 

Let us see how conservation of energy helps us understand the solution 
to Newton’s law. We may reduce the second-order equation mẍ = F(x) to 
a pair of first-order equations, simply by introducing the velocity v as a 
new variable. That is, we look for pairs of functions (x(t), v(t)) that satisfy 
the following system of equations 

\begin{align*}
\frac{dx}{dt} &= v(t), \\
\frac{dv}{dt} &= \frac{1}{m} F(x(t)). \tag{2.4}
\end{align*}

If (x(t), v(t)) is a solution to this system, then we can immediately see that 
x(t) satisfies Newton’s law, just by substituting dx/dt for v in the second 
equation. We refer to the set of possible pairs of the form (x, v) (i.e., ℝ²) 
as the phase space of the particle in ℝ¹. The appropriate initial conditions 
for this first-order system are x(0) = x₀ and v(0) = v₀. 

Once we are working in phase space, we can use the conservation of 
energy to help us. Conservation of energy means that each solution to 
the system (2.4) must lie entirely on a single “level curve” of the energy 
function, that is, the set 

\[
\{ (x, v) \in \mathbb{R}^2 \mid E(x, v) = E(x_0, v_0) \}. \tag{2.5}
\]

If F, and therefore also V, is smooth, then E is a smooth function of x 
and v. Then as long as (2.5) contains no critical points of E, this set will 
be a smooth curve in ℝ², by the implicit function theorem. If the level set 
(2.5) is also a simple closed curve, then the solutions of (2.4) will simply 
wind around and around this curve. Thus, the set that the solutions to (2.4) 
trace out in phase space can be determined simply from the conservation 
of energy. The only thing not apparent at the moment is how this curve is 
parameterized as a function of time. 
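
To see a trajectory staying on a level curve of the energy, one can integrate the first-order system dx/dt = v, dv/dt = F(x)/m numerically and monitor E along the way. The sketch below uses a classical fourth-order Runge–Kutta step and the sample force F(x) = −sin x, with potential V(x) = −cos x; the choice of force, the step size, the mass m = 1, and all function names are assumptions made for illustration. The numerical solution conserves energy only approximately, so what is reported is the largest observed deviation.

```python
import math


def force(x):
    """Sample position-dependent force F(x) = -sin(x) (an illustrative choice)."""
    return -math.sin(x)


def potential(x):
    """Potential with F = -dV/dx, namely V(x) = -cos(x)."""
    return -math.cos(x)


def energy(x, v, m=1.0):
    """Total energy E(x, v) = m v^2 / 2 + V(x)."""
    return 0.5 * m * v**2 + potential(x)


def rk4_step(x, v, dt, m=1.0):
    """One Runge-Kutta step for the system dx/dt = v, dv/dt = F(x)/m."""
    def deriv(x_, v_):
        return v_, force(x_) / m

    k1x, k1v = deriv(x, v)
    k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
    k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
    k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x_new, v_new


def energy_drift(x0, v0, dt=0.001, steps=10000):
    """Largest deviation of E from its initial value along the trajectory."""
    x, v = x0, v0
    e0 = energy(x, v)
    worst = 0.0
    for _ in range(steps):
        x, v = rk4_step(x, v, dt)
        worst = max(worst, abs(energy(x, v) - e0))
    return worst


print("max |E(t) - E(0)|:", energy_drift(1.0, 0.5))
```

The reported deviation is many orders of magnitude smaller than the energy itself, so the computed points (x, v) stay, to numerical accuracy, on the level curve through the initial point.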

In mechanics, a conserved quantity—such as the energy in the one- 
dimensional version of Newton’s law—is often referred to as an “integral 
of motion.” The reason for this is that although Newton’s second law is a 
second-order equation in x, the energy depends only on x and ẋ and not 
on ẍ. Thus, the equation 

\[
\frac{m}{2}(\dot{x}(t))^2 + V(x(t)) = E_0,
\]

where E₀ is the value of the energy at time t₀, is actually a first-order 
differential equation. We can solve for ẋ to put this equation into a more 
standard form: 

\[
\dot{x}(t) = \pm\sqrt{\frac{2(E_0 - V(x(t)))}{m}}. \tag{2.6}
\]

What this means is that by using conservation of energy we have turned the 
original second-order equation into a first-order equation. We have therefore 
“integrated” the original equation once, that is, changed an equation of 
the form ẍ(t) = ··· into an equation of the form ẋ(t) = ···. The first- 
order equation (2.6) is separable and can be solved more-or-less explicitly 
(Exercise 1). 
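
Separating variables in the first-order equation gives the travel time between two positions as the integral of dx/ẋ. As an illustration, consider the force F(x) = x² mentioned in Sect. 2.1.1, with V(x) = −x³/3, and assume m = 1, x₀ = 1, and v₀ = 1 (these values are our own choice). Then E₀ = 1/6, the velocity stays positive, and ẋ = √((1 + 2x³)/3), so the time needed to reach infinity is the integral of √3/√(1 + 2x³) over x from 1 to ∞. The substitution x = 1/w² turns this into the integral of 2√3/√(w⁶ + 2) over w from 0 to 1, which the sketch below evaluates with Simpson’s rule.

```python
import math


def escape_time_integrand(w):
    """Integrand 2*sqrt(3) / sqrt(w^6 + 2), obtained after setting x = 1/w^2."""
    return 2.0 * math.sqrt(3.0) / math.sqrt(w**6 + 2.0)


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


escape_time = simpson(escape_time_integrand, 0.0, 1.0)
print("time to reach infinity:", escape_time)
```

Since the integrand lies between 2 and √6 ≈ 2.45 on the whole interval, the escape time is finite and lies between these two values, which confirms that this solution does not exist for all t.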


2.1.3 Systems with Damping 


Up to now, we have considered forces that depend only on position. It is 
common, however, to consider forces that depend on the velocity as well 
as the position. In the case of a damped harmonic oscillator, for example, 
one typically assumes that there is, in addition to the force of the spring, 
a damping force (friction, say) that is proportional to the velocity. Thus, 
F = −kx − γẋ, where k is, as before, the spring constant and where γ > 0 
is the damping constant. The minus sign in front of γẋ reflects that the 
damping force operates in the opposite direction to the velocity, causing 
the particle to slow down. The equation of motion for such a system is then 

\[
m\ddot{x} + \gamma\dot{x} + kx = 0.
\]

If γ is small, the solutions to this equation display decaying oscillation, 
meaning sines and cosines multiplied by a decaying exponential; if γ is 
large, the solutions are pure decaying exponentials (Exercise 5). 

In the case of the damped harmonic oscillator, there is no longer a 
conserved energy. Specifically, there is no nonconstant continuous 
function E on ℝ² such that E(x(t), ẋ(t)) is independent of t for all solutions of 
Newton’s law. To see this, we simply observe that for γ > 0, all solutions 
x(t) have the property that (x(t), ẋ(t)) tends to the origin in the plane as t 
tends to infinity. Thus, if E is continuous and constant along each 
trajectory, the value of E at the starting point has to be the same as the value 
at the origin. Since the starting point is arbitrary, E must be constant. 

We now consider a general system with damping. 


Proposition 2.4 Suppose a particle moves in the presence of a force law 
given by F(x, ẋ) = F₁(x) − γẋ, with γ > 0. Define the energy E of the 
system by 

\[
E(x, \dot{x}) = \frac{1}{2}m\dot{x}^2 + V(x),
\]

where dV/dx = −F₁(x). Then along any trajectory x(t), we have 

\[
\frac{d}{dt} E(x(t), \dot{x}(t)) = -\gamma\,\dot{x}(t)^2 \le 0.
\]

Thus, although the energy is not conserved, it is decreasing with time, 
which gives us some information about the behavior of the system. 
Proof. We differentiate as in the proof of Theorem 2.3, except that now 
dV/dx = −F₁(x): 

\[
\frac{d}{dt} E(x(t), \dot{x}(t)) = \dot{x}(t)\,[m\ddot{x}(t) - F_1(x(t))].
\]

Since F₁ is not the full force function, the quantity in square brackets equals 
not zero but −γẋ. Thus, dE/dt = −γẋ². ∎ 

We can interpret Proposition 2.4 as saying that in the presence of friction, 
the system we are studying gives up some of its energy to heat energy in 
the environment, so that the energy of our system decreases with time. 
We will see that in higher dimensions, it is possible to have conservation 
of energy in the presence of velocity-dependent forces, provided that these 
forces act perpendicularly to the velocity. 
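
The monotone loss of energy in a damped system can be observed numerically. The sketch below integrates the damped oscillator equation mẍ + γẋ + kx = 0 with a fourth-order Runge–Kutta scheme and records the energy ½mẋ² + ½kx² after every step. The parameter values, the step size, and the function name are assumptions made for illustration, and the small tolerance in the monotonicity check allows for rounding error.

```python
def damped_energies(m=1.0, k=4.0, gamma=0.3, x0=1.0, v0=0.0,
                    dt=0.001, steps=20000):
    """Integrate m x'' + gamma x' + k x = 0 with RK4; return E after each step."""
    def deriv(x, v):
        return v, (-k * x - gamma * v) / m

    def energy(x, v):
        # V(x) = k x^2 / 2, so that dV/dx = k x = -F_1(x) with F_1(x) = -k x.
        return 0.5 * m * v**2 + 0.5 * k * x**2

    x, v = x0, v0
    energies = [energy(x, v)]
    for _ in range(steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        energies.append(energy(x, v))
    return energies


energies = damped_energies()
print("initial energy:", energies[0])
print("final energy:  ", energies[-1])
print("never increases:",
      all(b <= a + 1e-12 for a, b in zip(energies, energies[1:])))
```

With these parameters the energy falls to a small fraction of its initial value over the simulated interval, and it does so without ever increasing, consistent with Proposition 2.4.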


2.2 Motion in ℝⁿ 


We now consider a particle moving in ℝⁿ. The position x = (x₁, ..., xₙ) 
of a particle is now a vector in ℝⁿ, as is the velocity v and acceleration a. 
We let 

\[
\dot{\mathbf{x}} = (\dot{x}_1, \ldots, \dot{x}_n)
\]

denote the derivative of x with respect to t and we let ẍ denote the second 
derivative of x with respect to t. Newton’s law now takes the form 

\[
m\ddot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t), \dot{\mathbf{x}}(t)), \tag{2.7}
\]

where F : ℝⁿ × ℝⁿ → ℝⁿ is some force law, which in general may depend 
on both the position and velocity of the particle. 

We begin by considering forces that are independent of velocity, and we 
look for a conserved energy function in this setting. 


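Proposition 2.5 Consider Newton’s law (2.7) in the case of a 
velocity-independent force: mẍ(t) = F(x(t)). Then an energy function of the form 

\[
E(\mathbf{x}, \dot{\mathbf{x}}) = \frac{1}{2}m|\dot{\mathbf{x}}|^2 + V(\mathbf{x})
\]

is conserved if and only if V satisfies 

\[
-\nabla V = \mathbf{F},
\]

where ∇V is the gradient of V. 

Saying that E is “conserved” means that E(x(t), ẋ(t)) is independent of 
t for each solution x(t) of Newton’s law. The function V is the potential 
energy of the system. 

Proof. Differentiating gives 

\begin{align*}
\frac{d}{dt}\left( \frac{1}{2}m|\dot{\mathbf{x}}(t)|^2 + V(\mathbf{x}(t)) \right)
&= m\sum_{j=1}^{n} \dot{x}_j(t)\ddot{x}_j(t) + \sum_{j=1}^{n} \frac{\partial V}{\partial x_j}\,\dot{x}_j(t) \\
&= \dot{\mathbf{x}}(t)\cdot[m\ddot{\mathbf{x}}(t) + \nabla V(\mathbf{x}(t))] \\
&= \dot{\mathbf{x}}(t)\cdot[\mathbf{F}(\mathbf{x}(t)) + \nabla V(\mathbf{x}(t))].
\end{align*}

Thus, dE/dt will always be equal to zero if and only if we have 

\[
-\nabla V(\mathbf{x}) = \mathbf{F}(\mathbf{x})
\]

for all x. 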

We now encounter something that did not occur in the one-dimensional 
case. In ℝ¹, any smooth function can be expressed as the derivative of some 
other function. In ℝⁿ, however, not every vector-valued function F(x) can 
be expressed as (the negative of) the gradient of some scalar-valued function 
V. 


Definition 2.6 Suppose F is a smooth, ℝⁿ-valued function on a domain 
U ⊂ ℝⁿ. Then F is called conservative if there exists a smooth, real-valued 
function V on U such that F = −∇V. 


If the domain U is simply connected, then there is a simple local condition 
that characterizes conservative functions. 

Proposition 2.7 Suppose U is a simply connected domain in ℝⁿ and F 
is a smooth, ℝⁿ-valued function on U. Then F is conservative if and only 
if F satisfies 

\[
\frac{\partial F_j}{\partial x_k} - \frac{\partial F_k}{\partial x_j} = 0 \tag{2.8}
\]

at each point in U. 


When n = 3, it is easy to check that the condition (2.8) is equivalent 
to the curl ∇ × F of F being zero on U. The hypothesis that U be simply 
connected cannot be omitted; see Exercise 7. 
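
Condition (2.8) is straightforward to test numerically at a given point by replacing the partial derivatives with central differences. The sketch below compares a force that is a gradient by construction with a rotational force in ℝ³. Both sample forces, the test point, the step size, and the function names are our own illustrative choices, and a check at a single point is of course only a necessary test, not a proof that a force is conservative.

```python
def partial(F, j, k, point, h=1e-5):
    """Central-difference estimate of dF_j/dx_k at the given point."""
    forward = list(point)
    backward = list(point)
    forward[k] += h
    backward[k] -= h
    return (F(forward)[j] - F(backward)[j]) / (2.0 * h)


def max_violation(F, point):
    """Largest |dF_j/dx_k - dF_k/dx_j| over all pairs j < k."""
    n = len(point)
    return max(abs(partial(F, j, k, point) - partial(F, k, j, point))
               for j in range(n) for k in range(j + 1, n))


def gradient_force(x):
    """F = -grad V for V(x) = x1^2 * x2 + x3^2 (conservative by construction)."""
    return [-2.0 * x[0] * x[1], -x[0] ** 2, -2.0 * x[2]]


def rotational_force(x):
    """F = (-x2, x1, 0), which circulates around the x3-axis."""
    return [-x[1], x[0], 0.0]


point = [0.3, -1.2, 0.8]
print(max_violation(gradient_force, point))     # close to 0
print(max_violation(rotational_force, point))   # close to 2
```

The rotational force violates the condition everywhere, so it cannot be written as the negative gradient of any potential on any domain.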

Proof. If F is conservative, then 

\[
\frac{\partial F_j}{\partial x_k} = -\frac{\partial^2 V}{\partial x_k\,\partial x_j}
= -\frac{\partial^2 V}{\partial x_j\,\partial x_k} = \frac{\partial F_k}{\partial x_j}
\]

at every point in U. In the other direction, if F satisfies (2.8), V can be 
obtained by integrating F along paths and using the Stokes theorem to 
establish independence of choice of path. See, for example, Theorem 4.3 on 
p. 549 of [44] for a proof in the n = 3 case. The proof in higher dimensions 
is the same, provided one knows the general version of the Stokes theorem. ∎ 

We may also consider velocity-dependent forces. If, for example, F(x, v) 
= −γv + F₁(x), where γ is a positive constant, then we will again have 
energy that is decreasing with time. There is another new phenomenon, 
however, in dimension greater than 1, namely the possibility of having a 
conserved energy even when the force depends on velocity. 


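Proposition 2.8 Suppose a particle in $\mathbb{R}^n$ moves in the presence of a force
$\mathbf{F}$ of the form

\[ \mathbf{F}(\mathbf{x},\mathbf{v}) = -\nabla V(\mathbf{x}) + \mathbf{F}_2(\mathbf{x},\mathbf{v}), \]

where $V$ is a smooth function and where $\mathbf{F}_2$ satisfies

\[ \mathbf{v}\cdot\mathbf{F}_2(\mathbf{x},\mathbf{v}) = 0 \tag{2.9} \]

for all $\mathbf{x}$ and $\mathbf{v}$ in $\mathbb{R}^n$. Then the energy function $E(\mathbf{x},\mathbf{v}) = \frac{1}{2}m|\mathbf{v}|^2 + V(\mathbf{x})$
is constant along each trajectory.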


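If, for example, $\mathbf{F}_2$ is the force exerted on a charged particle in $\mathbb{R}^3$ by a
magnetic field $\mathbf{B}(\mathbf{x})$, then

\[ \mathbf{F}_2(\mathbf{x},\mathbf{v}) = q\,\mathbf{v}\times\mathbf{B}(\mathbf{x}), \]

where $q$ is the charge of the particle. This force clearly satisfies (2.9),
since $\mathbf{v}\times\mathbf{B}$ is orthogonal to $\mathbf{v}$.
Proof. See Exercise 8.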
If we have a system of $N$ particles, each moving in $\mathbb{R}^n$, then we denote the
position of the $j$th particle by

\[ \mathbf{x}^j = (x_1^j, \ldots, x_n^j). \]

Thus, in the expression $x_k^j$, the superscript $j$ indicates the $j$th particle, while
the subscript $k$ indicates the $k$th component. Newton’s law then takes the
form

\[ m_j\ddot{\mathbf{x}}^j = \mathbf{F}^j(\mathbf{x}^1,\ldots,\mathbf{x}^N,\dot{\mathbf{x}}^1,\ldots,\dot{\mathbf{x}}^N), \qquad j = 1, 2, \ldots, N, \]

where $m_j$ is the mass of the $j$th particle. Here, $\mathbf{F}^j$ is the force on the $j$th
particle, which in general will depend on the position and velocity not only
of that particle, but also of the other particles.


2.3.1 Conservation of Energy


In a system of particles, we cannot expect that the energy of each individual
particle will be conserved, because as the particles interact, they can
exchange energy. Rather, we should expect that, under suitable assumptions
on the forces $\mathbf{F}^j$, we can define a conserved energy function for the
whole system (the total energy of the system).

Let us consider forces depending only on the position of the particles,
and let us assume that the energy function will be of the form

\[ E(\mathbf{x}^1,\ldots,\mathbf{x}^N,\mathbf{v}^1,\ldots,\mathbf{v}^N) = \sum_{j=1}^{N}\frac{1}{2}m_j\,|\mathbf{v}^j|^2 + V(\mathbf{x}^1,\ldots,\mathbf{x}^N). \tag{2.10} \]

We will now try to see what form for $V$ (if any) will allow $E$ to be constant
along each trajectory.


Proposition 2.9 An energy function of the form (2.10) is constant along
each trajectory if

\[ \nabla^j V = -\mathbf{F}^j \tag{2.11} \]

for each $j$, where $\nabla^j$ is the gradient with respect to the variable $\mathbf{x}^j$.


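Proof. We compute that

\[
\begin{aligned}
\frac{dE}{dt} &= \sum_{j=1}^{N}\left[m_j\,\dot{\mathbf{x}}^j\cdot\ddot{\mathbf{x}}^j + \nabla^j V\cdot\dot{\mathbf{x}}^j\right] \\
&= \sum_{j=1}^{N}\dot{\mathbf{x}}^j\cdot\left[m_j\ddot{\mathbf{x}}^j + \nabla^j V\right] \\
&= \sum_{j=1}^{N}\dot{\mathbf{x}}^j\cdot\left[\mathbf{F}^j + \nabla^j V\right].
\end{aligned}
\]

If $\nabla^j V = -\mathbf{F}^j$, then $E$ will be conserved. ∎
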
As in the one-particle case, there is a simple condition for the existence 
of a potential function V satisfying (2.11). 


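Proposition 2.10 Suppose a force function $\mathbf{F} = (\mathbf{F}^1,\ldots,\mathbf{F}^N)$ is defined
on a simply connected domain $U$ in $\mathbb{R}^{nN}$. Then there exists a smooth
function $V$ on $U$ satisfying

\[ \nabla^j V = -\mathbf{F}^j \]

for all $j$ if and only if we have

\[ \frac{\partial F^j_k}{\partial x^l_m} = \frac{\partial F^l_m}{\partial x^j_k} \tag{2.12} \]

for all $j$, $k$, $l$, and $m$.


Proof. Apply Proposition 2.7 with $n$ replaced by $nN$ and with $j$ and $k$
replaced by the pairs $(j,k)$ and $(l,m)$. ∎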


2.3.2 Conservation of Momentum


We now introduce the notion of the momentum of a particle. 


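Definition 2.11 In an $N$-particle system, the momentum of the $j$th
particle, denoted $\mathbf{p}^j$, is the product of the mass and the velocity of that
particle:

\[ \mathbf{p}^j = m_j\dot{\mathbf{x}}^j. \]

The total momentum of the system, denoted $\mathbf{p}$, is defined as

\[ \mathbf{p} = \sum_{j=1}^{N}\mathbf{p}^j. \]

Observe that

\[ \frac{d\mathbf{p}^j}{dt} = m_j\ddot{\mathbf{x}}^j = \mathbf{F}^j. \]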


Thus, Newton’s law may be reformulated as saying, “The force is the rate 
of change of the momentum.” This is how Newton originally formulated 
his second law. 

Newton’s third law says, “For every action, there is an equal and opposite
reaction.” This law will apply if all forces are of the “two-particle” variety
and satisfy a natural symmetry property. Having two-particle forces means
that the force $\mathbf{F}^j$ on the $j$th particle is a sum of terms $\mathbf{F}^{j,k}$, $j \neq k$, where
$\mathbf{F}^{j,k}$ depends only on $\mathbf{x}^j$ and $\mathbf{x}^k$. The relevant symmetry property is that
$\mathbf{F}^{j,k}(\mathbf{x}^j,\mathbf{x}^k) = -\mathbf{F}^{k,j}(\mathbf{x}^k,\mathbf{x}^j)$; that is, the force exerted by the $j$th particle
on the $k$th particle is the negative (i.e., “equal and opposite”) of the force
exerted by the $k$th particle on the $j$th particle. If the forces are assumed
also to be conservative, then the potential energy of the system will be of
the form

\[ V(\mathbf{x}^1,\mathbf{x}^2,\ldots,\mathbf{x}^N) = \sum_{j<k}V^{j,k}(\mathbf{x}^j - \mathbf{x}^k). \tag{2.13} \]

One important consequence of Newton’s third law is conservation of the
total momentum of the system.


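Proposition 2.12 Suppose that for each $j$, the force on the $j$th particle is
of the form

\[ \mathbf{F}^j(\mathbf{x}^1,\mathbf{x}^2,\ldots,\mathbf{x}^N) = \sum_{k\neq j}\mathbf{F}^{j,k}(\mathbf{x}^j,\mathbf{x}^k) \]

for certain functions $\mathbf{F}^{j,k}$. Suppose also that we have the “equal and
opposite” condition

\[ \mathbf{F}^{j,k}(\mathbf{x}^j,\mathbf{x}^k) = -\mathbf{F}^{k,j}(\mathbf{x}^k,\mathbf{x}^j). \]

Then the total momentum of the system is conserved.


Note that since the rate of change of $\mathbf{p}^j$ is $\mathbf{F}^j$, the force on the $j$th
particle, the momentum of each individual particle is not constant in time,
except in the trivial case of a noninteracting system (one in which all forces
are zero).

Proof. Differentiating gives

\[ \frac{d\mathbf{p}}{dt} = \sum_{j=1}^{N}\frac{d\mathbf{p}^j}{dt} = \sum_{j=1}^{N}\mathbf{F}^j = \sum_{j=1}^{N}\sum_{k\neq j}\mathbf{F}^{j,k}(\mathbf{x}^j,\mathbf{x}^k). \]

By the equal and opposite condition, $\mathbf{F}^{j,k}(\mathbf{x}^j,\mathbf{x}^k)$ cancels with $\mathbf{F}^{k,j}(\mathbf{x}^k,\mathbf{x}^j)$,
so $d\mathbf{p}/dt = 0$. ∎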
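
The cancellation in this proof can be watched in a small simulation. The sketch below is illustrative only: the spring-like pair force, the masses, the initial data, and the semi-implicit Euler step are all assumptions chosen for the example, not anything prescribed by the text. Because the pair forces are equal and opposite, the total momentum should change only by rounding error, while the momentum of each individual particle does change.

```python
def pair_force(xj, xk, kappa=1.0):
    """Force on particle j due to particle k: a spring pulling j toward k.
    Exchanging the roles of j and k flips the sign (equal and opposite)."""
    return [-kappa * (a - b) for a, b in zip(xj, xk)]


def forces(xs):
    """F^j = sum over k != j of F^{j,k}(x^j, x^k), for every particle j."""
    out = []
    for j, xj in enumerate(xs):
        f = [0.0] * len(xj)
        for k, xk in enumerate(xs):
            if k != j:
                f = [fi + gi for fi, gi in zip(f, pair_force(xj, xk))]
        out.append(f)
    return out


def total_momentum(ms, vs):
    return [sum(m * v[i] for m, v in zip(ms, vs)) for i in range(len(vs[0]))]


def step(ms, xs, vs, dt):
    """One semi-implicit Euler step of m_j x''^j = F^j."""
    fs = forces(xs)
    vs = [[vi + dt * fi / m for vi, fi in zip(v, f)]
          for m, v, f in zip(ms, vs, fs)]
    xs = [[xi + dt * vi for xi, vi in zip(x, v)] for x, v in zip(xs, vs)]
    return xs, vs


def initial_state():
    """Three particles in the plane (hypothetical sample data)."""
    ms = [1.0, 2.0, 3.5]
    xs = [[0.0, 0.0], [1.0, 0.5], [-0.7, 2.0]]
    vs = [[0.3, -0.1], [0.0, 0.4], [-0.2, 0.0]]
    return ms, xs, vs


def momentum_drift(steps=1000, dt=0.01):
    """Largest change in any component of the total momentum."""
    ms, xs, vs = initial_state()
    p0 = total_momentum(ms, vs)
    for _ in range(steps):
        xs, vs = step(ms, xs, vs, dt)
    p1 = total_momentum(ms, vs)
    return max(abs(a - b) for a, b in zip(p0, p1))
```

A call such as `momentum_drift()` should return a number at the level of floating-point rounding.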

Let us consider, now, a more general situation in which we have con- 
servative forces, but not necessarily of the “two-particle” form. It is still 
possible to have conservation of momentum, as the following result shows. 


Proposition 2.13 If a multiparticle system has a force law coming from
a potential $V$, then the total momentum of the system is conserved if and
only if

\[ V(\mathbf{x}^1 + \mathbf{a}, \mathbf{x}^2 + \mathbf{a}, \ldots, \mathbf{x}^N + \mathbf{a}) = V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \tag{2.14} \]

for all $\mathbf{a} \in \mathbb{R}^n$.


Proof. Apply (2.14) with $\mathbf{a} = t\mathbf{e}_k$, where $\mathbf{e}_k$ is the vector with a 1 in the
$k$th spot and zeros elsewhere. Differentiating with respect to $t$ at $t = 0$
gives

\[ 0 = \sum_{j=1}^{N}\frac{\partial V}{\partial x^j_k} = -\sum_{j=1}^{N}F^j_k = -\sum_{j=1}^{N}\frac{dp^j_k}{dt} = -\frac{dp_k}{dt}, \]

where $p_k$ is the $k$th component of the total momentum $\mathbf{p}$. Thus, if (2.14)
holds, $\mathbf{p}$ is constant in time.

Conversely, if the momentum is conserved, then the sum of the forces is
zero at every point, and so

\[
\begin{aligned}
\frac{d}{dt}V(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a})
&= \sum_{j=1}^{N}\nabla^j V(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a})\cdot\mathbf{a} \\
&= -\sum_{j=1}^{N}\mathbf{F}^j(\mathbf{x}^1 + t\mathbf{a}, \mathbf{x}^2 + t\mathbf{a}, \ldots, \mathbf{x}^N + t\mathbf{a})\cdot\mathbf{a} \\
&= 0
\end{aligned}
\]

for all $t$. Thus, the value of the quantity being differentiated is the same at
$t = 0$ as at $t = 1$, which establishes (2.14). ∎

The moral of the story is that conservation of momentum is a consequence
of translation invariance of the system, where “translation invariance”
means invariance under simultaneous translations of every particle by the
same amount. (See Exercise 11 for a more general version of this result.)
If the potential is of the “two-particle” form (2.13), then it is evident that
the condition (2.14) is satisfied.


2.3.3 Center of Mass


We now consider an important application of momentum conservation. 


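Definition 2.14 For a system of $N$ particles moving in $\mathbb{R}^n$, the center
of mass of the system at a fixed time is the vector $\mathbf{c} \in \mathbb{R}^n$ given by

\[ \mathbf{c} = \frac{1}{M}\sum_{j=1}^{N}m_j\mathbf{x}^j, \]

where $M = \sum_{j=1}^{N}m_j$ is the total mass of the system.


The center of mass is a weighted average of the positions of the various
particles. Differentiating $\mathbf{c}(t)$ with respect to $t$ gives

\[ \frac{d\mathbf{c}}{dt} = \frac{1}{M}\sum_{j=1}^{N}m_j\dot{\mathbf{x}}^j = \frac{\mathbf{p}}{M}, \tag{2.15} \]

where $\mathbf{p}$ is the total momentum.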
Proposition 2.15 Suppose the total momentum $\mathbf{p}$ of a system is conserved.
Then the center of mass moves in a straight line at constant speed.
Specifically,

\[ \mathbf{c}(t) = \mathbf{c}(t_0) + (t - t_0)\frac{\mathbf{p}}{M}, \]

where $\mathbf{c}(t_0)$ is the center of mass at some initial time $t_0$.


Proof. The result follows easily from (2.15). ∎

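The notion of center of mass is particularly useful in a system of two
particles in which momentum is conserved. For a system of two particles, if
the potential energy $V(\mathbf{x}^1, \mathbf{x}^2)$ is invariant under simultaneous translations
of $\mathbf{x}^1$ and $\mathbf{x}^2$, then it is of the form

\[ V(\mathbf{x}^1, \mathbf{x}^2) = \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2), \]

where $\tilde{V}(\mathbf{a}) = V(\mathbf{a}, 0)$.
Now, the positions $\mathbf{x}^1$, $\mathbf{x}^2$ of the particles can be recovered from knowledge
of the center of mass $\mathbf{c}$ and the relative position

\[ \mathbf{y} := \mathbf{x}^1 - \mathbf{x}^2 \]

as follows:

\[
\begin{aligned}
\mathbf{x}^1 &= \mathbf{c} + \frac{m_2}{m_1 + m_2}\,\mathbf{y}, \\
\mathbf{x}^2 &= \mathbf{c} - \frac{m_1}{m_1 + m_2}\,\mathbf{y}.
\end{aligned}
\]

Meanwhile, we may compute that

\[ \ddot{\mathbf{y}}(t) = \ddot{\mathbf{x}}^1 - \ddot{\mathbf{x}}^2 = -\frac{1}{m_1}\nabla\tilde{V}(\mathbf{x}^1 - \mathbf{x}^2) - \frac{1}{m_2}\nabla\tilde{V}(\mathbf{x}^1 - \mathbf{x}^2). \]

This calculation gives the following result.


Proposition 2.16 For a two-particle system with potential energy of the
form $V(\mathbf{x}^1, \mathbf{x}^2) = \tilde{V}(\mathbf{x}^1 - \mathbf{x}^2)$, the relative position $\mathbf{y} := \mathbf{x}^1 - \mathbf{x}^2$ satisfies
the differential equation

\[ \mu\,\ddot{\mathbf{y}} = -\nabla\tilde{V}(\mathbf{y}), \]

where $\mu$ is the reduced mass given by

\[ \mu = \frac{m_1 m_2}{m_1 + m_2}. \]

Thus, when the total momentum of a two-particle system is conserved,
the relative position evolves as a one-particle system with “effective” mass $\mu$,
while the center of mass moves “trivially,” as described in Proposition 2.15.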
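
The change of variables from $(\mathbf{x}^1, \mathbf{x}^2)$ to $(\mathbf{c}, \mathbf{y})$ is simple enough to spell out in code. This is a minimal sketch; the function names are illustrative assumptions, and positions are plain Python lists.

```python
def center_of_mass(m1, m2, x1, x2):
    """c = (m1 x^1 + m2 x^2) / (m1 + m2)."""
    total = m1 + m2
    return [(m1 * a + m2 * b) / total for a, b in zip(x1, x2)]


def relative_position(x1, x2):
    """y = x^1 - x^2."""
    return [a - b for a, b in zip(x1, x2)]


def recover_positions(m1, m2, c, y):
    """Invert the change of variables: x^1 = c + m2/(m1+m2) y and
    x^2 = c - m1/(m1+m2) y."""
    total = m1 + m2
    x1 = [ci + m2 / total * yi for ci, yi in zip(c, y)]
    x2 = [ci - m1 / total * yi for ci, yi in zip(c, y)]
    return x1, x2


def reduced_mass(m1, m2):
    """mu = m1 m2 / (m1 + m2), equivalently 1/mu = 1/m1 + 1/m2."""
    return m1 * m2 / (m1 + m2)
```

For masses 2 and 6 the reduced mass is 1.5, smaller than either mass, and `recover_positions` applied to the output of `center_of_mass` and `relative_position` returns the original pair of positions.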
FIGURE 2.1. $A(t)$ is the area of the shaded region.





2.4 Angular Momentum 


We start by considering angular momentum in the simplest nontrivial case,
motion in $\mathbb{R}^2$.


Definition 2.17 Consider a particle moving in $\mathbb{R}^2$, having position $\mathbf{x}$,
velocity $\mathbf{v}$, and momentum $\mathbf{p} = m\mathbf{v}$. Then the angular momentum of
the particle, denoted $J$, is given by

\[ J = x_1p_2 - x_2p_1. \tag{2.16} \]

In more geometric terms, $J = |\mathbf{x}|\,|\mathbf{p}|\sin\phi$, where $\phi$ is the angle (measured
counterclockwise) between $\mathbf{x}$ and $\mathbf{p}$. We can look at $J$ in yet another way
as follows. If $\theta$ is the usual angle in polar coordinates on $\mathbb{R}^2$, then an
elementary calculation (Exercise 9) shows that

\[ J = mr^2\frac{d\theta}{dt}. \tag{2.17} \]

It then follows that

\[ J = 2m\frac{dA}{dt}, \tag{2.18} \]

where $A = (1/2)\int r^2\,d\theta$ is the area being swept out by the curve $\mathbf{x}(t)$.
See Fig. 2.1.

One significant property of the angular momentum is that it (like the 
energy) is conserved in certain situations. 


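Proposition 2.18 Suppose a particle of mass $m$ is moving in $\mathbb{R}^2$ under
the influence of a conservative force with the potential function $V(\mathbf{x})$. If
$V$ is invariant under rotations in $\mathbb{R}^2$, then the angular momentum $J =
x_1p_2 - x_2p_1$ is independent of time along any solution of Newton’s equation.
Conversely, if $J$ is independent of time along every solution of Newton’s
equation, then $V$ is invariant under rotations.

Proof. Differentiating (2.16) along a solution of Newton’s law gives

\[
\begin{aligned}
\frac{dJ}{dt} &= \frac{dx_1}{dt}p_2 + x_1\frac{dp_2}{dt} - \frac{dx_2}{dt}p_1 - x_2\frac{dp_1}{dt} \\
&= \frac{1}{m}p_1p_2 - x_1\frac{\partial V}{\partial x_2} - \frac{1}{m}p_2p_1 + x_2\frac{\partial V}{\partial x_1} \\
&= x_2\frac{\partial V}{\partial x_1} - x_1\frac{\partial V}{\partial x_2}.
\end{aligned}
\]

On the other hand, consider rotations $R_\theta$ in $\mathbb{R}^2$ given by

\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

If we differentiate $V$ along this family of rotations, we obtain

\[ \frac{d}{d\theta}V(R_\theta\mathbf{x})\Big|_{\theta=0} = \frac{\partial V}{\partial x_1}\frac{dx_1}{d\theta} + \frac{\partial V}{\partial x_2}\frac{dx_2}{d\theta} = -x_2\frac{\partial V}{\partial x_1} + x_1\frac{\partial V}{\partial x_2} = -\frac{dJ}{dt}. \]

Thus, the angular derivative of $V$ is zero if and only if $J$ is constant. ∎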
Conservation of J [together with the relation (2.18)] gives the following 
result. 


Corollary 2.19 (Kepler’s Second Law) Suppose a particle is moving
in $\mathbb{R}^2$ in the presence of a force associated with a rotationally invariant
potential. If $\mathbf{x}(t)$ is the trajectory of the particle, then the area swept out by
$\mathbf{x}(t)$ between times $t = a$ and $t = b$ is $(b - a)J/(2m)$, where $J$ is the constant
value of the angular momentum along the trajectory. Since the area swept
out depends only on $b - a$, we may say that “equal areas are swept out in
equal times.”
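
The corollary can be illustrated with a short numerical experiment. The sketch below makes several assumptions of its own: the rotationally invariant sample potential $V(\mathbf{x}) = |\mathbf{x}|^4/4$, the velocity-Verlet integrator, and the approximation of the swept area by the triangles spanned by consecutive positions. For a central force this particular integrator happens to preserve $J$ up to rounding, which makes the check clean; other integrators would show a small drift.

```python
def force(x):
    """F = -grad V for the sample potential V(x) = |x|^4 / 4."""
    r2 = x[0] ** 2 + x[1] ** 2
    return (-r2 * x[0], -r2 * x[1])


def angular_momentum(x, p):
    """J = x1 p2 - x2 p1, as in (2.16)."""
    return x[0] * p[1] - x[1] * p[0]


def sweep(x, p, m, dt, steps):
    """Integrate Newton's law with velocity Verlet and accumulate the signed
    area of the triangles (0, x_old, x_new), a polygonal stand-in for the
    area swept out by the trajectory."""
    area = 0.0
    for _ in range(steps):
        f = force(x)
        p_half = (p[0] + 0.5 * dt * f[0], p[1] + 0.5 * dt * f[1])
        x_new = (x[0] + dt * p_half[0] / m, x[1] + dt * p_half[1] / m)
        f_new = force(x_new)
        p = (p_half[0] + 0.5 * dt * f_new[0], p_half[1] + 0.5 * dt * f_new[1])
        area += 0.5 * (x[0] * x_new[1] - x[1] * x_new[0])
        x = x_new
    return x, p, area
```

Starting from `x = (1.0, 0.0)`, `p = (0.3, 1.5)` with `m = 2.0`, the angular momentum is 1.5, and after an elapsed time of 5 the accumulated area should agree with $(b - a)J/(2m) = 1.875$.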


Kepler, of course, was interested in the motion of planets in $\mathbb{R}^3$, not in
$\mathbb{R}^2$. The motion of a planet moving in the “inverse square” force of a sun
will, however, always lie in a plane. (This claim follows from the
three-dimensional version of conservation of angular momentum, as explained in
Sect. 2.6.1.)

In $\mathbb{R}^3$, the angular momentum of the particle is a vector, given by

\[ \mathbf{J} = \mathbf{x}\times\mathbf{p}, \tag{2.19} \]

where $\times$ denotes the cross product (or vector product). Thus, for example,

\[ J_3 = x_1p_2 - x_2p_1. \tag{2.20} \]

If, then, we have a particle in $\mathbb{R}^3$ that just happens to be moving in $\mathbb{R}^2$
(i.e., $x_3 = 0$ and $p_3 = 0$), then the angular momentum will be in the
$z$-direction with $z$-component given by the quantity $J$ defined in
Definition 2.17.

The representation of the angular momentum of a particle in $\mathbb{R}^3$ as a
vector is a low-dimensional peculiarity. For a particle in $\mathbb{R}^n$, the angular
momentum is a skew-symmetric matrix given by

\[ J_{jk} = x_jp_k - x_kp_j. \tag{2.21} \]

In the $\mathbb{R}^3$ case, the entries of the $3 \times 3$ angular momentum matrix are made
up by the three components of the angular momentum vector together with
their negatives, with zeros along the diagonal. [Compare, e.g., (2.20) and
(2.21).]


Definition 2.20 For a system of $N$ particles moving in $\mathbb{R}^n$, the total
angular momentum of the system is the skew-symmetric matrix $J$ given
by

\[ J_{jk} = \sum_{l=1}^{N}\left(x^l_j\,p^l_k - x^l_k\,p^l_j\right). \tag{2.22} \]


Theorem 2.21 Suppose a system of $N$ particles in $\mathbb{R}^n$ is moving under
the influence of conservative forces with potential function $V$. If $V$ satisfies

\[ V(R\mathbf{x}^1, R\mathbf{x}^2, \ldots, R\mathbf{x}^N) = V(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) \tag{2.23} \]

for every rotation matrix $R$, then the total angular momentum of the system
is conserved (constant along each trajectory). Conversely, if the total
angular momentum is constant along each trajectory, then $V$ satisfies (2.23).


The proof of this result is similar to that of Proposition 2.18 and is left 
as an exercise (Exercise 10). We will re-examine the concept of angular 
momentum in the next section using the language of Poisson brackets and 
Hamiltonian flows. 


2.5 Poisson Brackets and Hamiltonian Mechanics 


We consider now the Hamiltonian approach to classical mechanics. (There 
is also the Lagrangian approach, but that approach is not as relevant for 
our purposes.) The Hamiltonian approach, and in particular the Poisson 
bracket, will help us to understand the general phenomenon of conserved 
quantities. The Poisson bracket is also an important source of motivation 
for the use of commutators in quantum mechanics. 

In the Hamiltonian approach to mechanics, we think of the energy function
as a function of position and momentum, rather than position and
velocity, and we refer to it as the “Hamiltonian.” If a particle in $\mathbb{R}^n$ has
the usual sort of energy function (kinetic energy plus potential energy), we
have

\[ H(\mathbf{x},\mathbf{p}) = \frac{1}{2m}\sum_{j=1}^{n}p_j^2 + V(\mathbf{x}). \tag{2.24} \]

Here, as usual, $p_j = m\dot{x}_j$. We now observe that Newton’s law can be
expressed in the following form:

\[
\begin{aligned}
\frac{dx_j}{dt} &= \frac{\partial H}{\partial p_j}, \\
\frac{dp_j}{dt} &= -\frac{\partial H}{\partial x_j}.
\end{aligned}
\tag{2.25}
\]

After all, with $H$ of the indicated form, these equations read $dx_j/dt =
p_j/m$, which is just the definition of $p_j$, and $dp_j/dt = -\partial V/\partial x_j = F_j$, which
is just Newton’s law, in the form originally given by Newton. We refer to
Newton’s law, in the form (2.25), as Hamilton’s equations.

Although it is not obvious at the moment that we have gained anything 
by writing Newton’s law in the form (2.25), let us proceed on a bit further 
and see. Our next step is to introduce the Poisson bracket. 


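Definition 2.22 Let $f$ and $g$ be two smooth functions on $\mathbb{R}^{2n}$, where an
element of $\mathbb{R}^{2n}$ is thought of as a pair $(\mathbf{x},\mathbf{p})$, with $\mathbf{x} \in \mathbb{R}^n$ representing the
position of a particle and $\mathbf{p} \in \mathbb{R}^n$ representing the momentum of a particle.
Then the Poisson bracket of $f$ and $g$, denoted $\{f,g\}$, is the function on
$\mathbb{R}^{2n}$ given by

\[ \{f,g\}(\mathbf{x},\mathbf{p}) = \sum_{j=1}^{n}\left(\frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial x_j}\right). \]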


The Poisson bracket has the following properties. 


Proposition 2.23 For all smooth functions $f$, $g$, and $h$ on $\mathbb{R}^{2n}$ we have
the following:


1. $\{f, g + ch\} = \{f,g\} + c\{f,h\}$ for all $c \in \mathbb{R}$

2. $\{g,f\} = -\{f,g\}$

3. $\{f, gh\} = \{f,g\}h + g\{f,h\}$

4. $\{f,\{g,h\}\} = \{\{f,g\},h\} + \{g,\{f,h\}\}$


Properties 1 and 2 of Proposition 2.23 say that the Poisson bracket is
bilinear and skew-symmetric. Property 3 says that the operation of “bracket
with $f$” satisfies the derivation property (similar to the product rule for
derivatives) with respect to pointwise multiplication of functions, while
Property 4 says that “bracket with $f$” satisfies the derivation property
with respect to the Poisson bracket itself. Property 4 is equivalent to the
Jacobi identity:

\[ \{f,\{g,h\}\} + \{h,\{f,g\}\} + \{g,\{h,f\}\} = 0, \tag{2.26} \]

as may easily be seen using the skew-symmetry of the Poisson bracket.
The Jacobi identity, along with bilinearity and skew-symmetry, means that
the space of $C^\infty$ functions on $\mathbb{R}^{2n}$ forms a Lie algebra under the operation
of the Poisson bracket. (See Chap. 16.)

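Proof. The first two properties of the Poisson bracket are obvious and the
third is an easy consequence of the product rule. Let us think about what
goes into proving Property 4 by direct computation. (An alternative proof
is given in Exercise 15.) We compute that

\[
\begin{aligned}
\{f,\{g,h\}\} = {} & \sum_{j,k=1}^{n}\frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j}\left(\frac{\partial g}{\partial x_k}\frac{\partial h}{\partial p_k} - \frac{\partial g}{\partial p_k}\frac{\partial h}{\partial x_k}\right) \\
& - \sum_{j,k=1}^{n}\frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j}\left(\frac{\partial g}{\partial x_k}\frac{\partial h}{\partial p_k} - \frac{\partial g}{\partial p_k}\frac{\partial h}{\partial x_k}\right).
\end{aligned}
\]

Just the first term in the expression for $\{f,\{g,h\}\}$ generates the following
four terms (all summed over $j$ and $k$) after we use the product rule:

\[
\frac{\partial f}{\partial x_j}\frac{\partial^2 g}{\partial x_k\,\partial p_j}\frac{\partial h}{\partial p_k}
+ \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial x_k}\frac{\partial^2 h}{\partial p_j\,\partial p_k}
- \frac{\partial f}{\partial x_j}\frac{\partial^2 g}{\partial p_j\,\partial p_k}\frac{\partial h}{\partial x_k}
- \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_k}\frac{\partial^2 h}{\partial x_k\,\partial p_j}.
\tag{2.27}
\]

We see, then, that the left-hand side of (2.26) will have a total of 24 terms,
each summed over $j$ and $k$. Each term will have a single derivative on two of the
three functions, and two derivatives on the third function. There are three
possibilities for which function gets two derivatives. Once that function is
chosen, there are four possibilities for which derivatives go on the other
two functions, with the function that gets two derivatives getting whatever
derivatives remain (for a total of two $x$-derivatives and two $p$-derivatives).
That makes 12 possible terms. It is a tedious but straightforward exercise
to check that each of these 12 possible terms occurs twice in the left-hand
side of (2.26), with opposite signs. To check just one case explicitly, in
computing $\{h,\{f,g\}\}$, we will get a term like the second term in (2.27),
but with $(f,g,h)$ replaced by $(h,f,g)$:

\[ \frac{\partial h}{\partial x_j}\frac{\partial f}{\partial x_k}\frac{\partial^2 g}{\partial p_j\,\partial p_k}. \]

This term (in the computation of $\{h,\{f,g\}\}$) cancels with the third term
in (2.27) (in the computation of $\{f,\{g,h\}\}$), once the names of the
summation indices $j$ and $k$ are interchanged. ∎


The following elementary result will provide a helpful analogy to the
“canonical commutation relations” in quantum mechanics.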





Proposition 2.24 The position and momentum functions satisfy the
following Poisson bracket relations:

\[
\begin{aligned}
\{x_j, x_k\} &= 0, \\
\{p_j, p_k\} &= 0, \\
\{x_j, p_k\} &= \delta_{jk}.
\end{aligned}
\]

Proof. Direct calculation. ∎
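
The bracket relations above, and the Jacobi identity (2.26), can be spot-checked numerically. The following sketch is an illustration under stated assumptions: functions on phase space are represented as Python callables `f(x, p)` taking two lists, the partial derivatives are central-difference estimates, and the helper names are hypothetical.

```python
def poisson_bracket(f, g, x, p, h=1e-4):
    """Central-difference estimate of {f, g} at the point (x, p)."""
    def d(fun, use_x, j):
        xp, pp, xm, pm = list(x), list(p), list(x), list(p)
        if use_x:
            xp[j] += h
            xm[j] -= h
        else:
            pp[j] += h
            pm[j] -= h
        return (fun(xp, pp) - fun(xm, pm)) / (2 * h)

    return sum(d(f, True, j) * d(g, False, j) - d(f, False, j) * d(g, True, j)
               for j in range(len(x)))


def x_coord(j):
    """The coordinate function (x, p) -> x_j."""
    return lambda x, p: x[j]


def p_coord(k):
    """The coordinate function (x, p) -> p_k."""
    return lambda x, p: p[k]


def jacobi_defect(f, g, hfun, x, p):
    """{f,{g,h}} + {h,{f,g}} + {g,{h,f}} at (x, p); zero in exact arithmetic."""
    def nested(a, b, c):
        inner = lambda xx, pp: poisson_bracket(b, c, xx, pp)
        return poisson_bracket(a, inner, x, p)

    return nested(f, g, hfun) + nested(hfun, f, g) + nested(g, hfun, f)
```

Evaluating `poisson_bracket(x_coord(j), p_coord(k), x, p)` at any point should give approximately 1 when `j == k` and 0 otherwise; the nested finite differences in `jacobi_defect` are noisier, so only rough agreement with zero should be expected there.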
One of the main reasons for considering the Poisson bracket is the 
following simple result. 


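Proposition 2.25 If $(\mathbf{x}(t),\mathbf{p}(t))$ is a solution to Hamilton’s equations
(2.25), then for any smooth function $f$ on $\mathbb{R}^{2n}$ we have

\[ \frac{d}{dt}f(\mathbf{x}(t),\mathbf{p}(t)) = \{f,H\}(\mathbf{x}(t),\mathbf{p}(t)). \]

We generally write Proposition 2.25 in a more concise form as

\[ \frac{df}{dt} = \{f,H\}, \]

where the time derivative is understood as being along some trajectory.
Proof. Using the chain rule and Hamilton’s equations, we have

\[
\begin{aligned}
\frac{df}{dt} &= \sum_{j=1}^{n}\left(\frac{\partial f}{\partial x_j}\frac{dx_j}{dt} + \frac{\partial f}{\partial p_j}\frac{dp_j}{dt}\right) \\
&= \sum_{j=1}^{n}\left(\frac{\partial f}{\partial x_j}\frac{\partial H}{\partial p_j} + \frac{\partial f}{\partial p_j}\left(-\frac{\partial H}{\partial x_j}\right)\right) \\
&= \{f,H\},
\end{aligned}
\]

as claimed. ∎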

Observe that Proposition 2.25 includes Hamilton’s equations themselves 
as special cases, by taking f(x,p) = x; and by taking f(x,p) = p;. Thus, 
this proposition gives a more coordinate-independent way of expressing the 
time-evolution. 


Corollary 2.26 Call a smooth function $f$ on $\mathbb{R}^{2n}$ a conserved quantity if
$f(\mathbf{x}(t),\mathbf{p}(t))$ is independent of $t$ for each solution $(\mathbf{x}(t),\mathbf{p}(t))$ of Hamilton’s
equations. Then $f$ is a conserved quantity if and only if

\[ \{f,H\} = 0. \]

In particular, the Hamiltonian $H$ is a conserved quantity.


Conserved quantities are also called constants of motion. See Conclusion
2.31 for another perspective on this result. Conserved quantities (when one
can find them) are useful in that we know that trajectories must lie in
the level surfaces of any conserved quantity. Suppose, for example, that
we have a particle moving in $\mathbb{R}^2$ and that the Hamiltonian $H$ and one
other independent function $f$ (such as, say, the angular momentum) are
conserved quantities. Then, rather than looking for trajectories in the
four-dimensional phase space, we look for them inside the joint level sets of $H$
and $f$ (sets of the form $H(\mathbf{x},\mathbf{p}) = a$, $f(\mathbf{x},\mathbf{p}) = b$, for some constants $a$
and $b$). These joint level sets are (generically) two-dimensional instead of
four-dimensional, so using the constants of motion greatly simplifies the
problem, reducing it from an equation in four variables to one in only two variables.
Solving Hamilton’s equations on $\mathbb{R}^{2n}$ gives rise to a flow on $\mathbb{R}^{2n}$, that is, a
family $\Phi_t$ of diffeomorphisms of $\mathbb{R}^{2n}$, where $\Phi_t(\mathbf{x},\mathbf{p})$ is equal to the solution
at time $t$ of Hamilton’s equations with initial condition $(\mathbf{x},\mathbf{p})$. Since it is
possible (depending on the choice of potential function $V$) that a particle
can escape to infinity in finite time, the maps $\Phi_t$ are not necessarily defined
on all of $\mathbb{R}^{2n}$, but only on some open subset thereof. If $\Phi_t$ does happen to
be defined on all of $\mathbb{R}^{2n}$ (for all $t$), then we say that the flow is complete.


Theorem 2.27 (Liouville’s Theorem) The flow associated with Hamilton’s
equations, for an arbitrary Hamiltonian function $H$, preserves the
$(2n)$-dimensional volume measure

\[ dx_1\,dx_2\cdots dx_n\,dp_1\,dp_2\cdots dp_n. \]

What this means, more precisely, is that if a measurable set $E$ is contained
in the domain of $\Phi_t$ for some $t \in \mathbb{R}$, then the volume of $\Phi_t(E)$ is
equal to the volume of $E$.

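Proof. Hamilton’s equations may be written as

\[
\frac{d}{dt}\begin{pmatrix} x_1 \\ \vdots \\ x_n \\ p_1 \\ \vdots \\ p_n \end{pmatrix}
= \begin{pmatrix} \partial H/\partial p_1 \\ \vdots \\ \partial H/\partial p_n \\ -\partial H/\partial x_1 \\ \vdots \\ -\partial H/\partial x_n \end{pmatrix}.
\tag{2.28}
\]

This means that Hamilton’s equations describe the flow along the vector
field on $\mathbb{R}^{2n}$ appearing on the right-hand side of (2.28). By a standard result
from vector calculus (see, e.g., Proposition 16.33 in [29]), this flow will be
volume-preserving if and only if the divergence of the vector field is zero.
We compute this divergence as

\[
\frac{\partial}{\partial x_1}\frac{\partial H}{\partial p_1} + \cdots + \frac{\partial}{\partial x_n}\frac{\partial H}{\partial p_n}
- \frac{\partial}{\partial p_1}\frac{\partial H}{\partial x_1} - \cdots - \frac{\partial}{\partial p_n}\frac{\partial H}{\partial x_n}.
\tag{2.29}
\]

Since

\[ \frac{\partial^2 H}{\partial x_j\,\partial p_j} = \frac{\partial^2 H}{\partial p_j\,\partial x_j}, \]

the divergence is zero. ∎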
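
Volume preservation can be checked numerically in the case $n = 1$, where it says that the Jacobian determinant of $\Phi_t$ equals 1. The sketch below is illustrative only: the pendulum Hamiltonian $H = p^2/2 - \cos x$ is an assumed example, the flow is approximated by a classical Runge-Kutta scheme rather than computed exactly, and the Jacobian is estimated by finite differences, so agreement is only up to numerical error.

```python
import math


def hamiltonian_field(x, p):
    """Right-hand side of Hamilton's equations for H = p^2/2 - cos(x):
    dx/dt = dH/dp = p and dp/dt = -dH/dx = -sin(x)."""
    return p, -math.sin(x)


def flow(x, p, t, steps=2000):
    """Approximate the time-t flow map Phi_t(x, p) with classical RK4."""
    dt = t / steps
    for _ in range(steps):
        k1x, k1p = hamiltonian_field(x, p)
        k2x, k2p = hamiltonian_field(x + 0.5 * dt * k1x, p + 0.5 * dt * k1p)
        k3x, k3p = hamiltonian_field(x + 0.5 * dt * k2x, p + 0.5 * dt * k2p)
        k4x, k4p = hamiltonian_field(x + dt * k3x, p + dt * k3p)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p


def jacobian_determinant(x, p, t, h=1e-5):
    """Finite-difference estimate of det D(Phi_t) at the point (x, p)."""
    xa, pa = flow(x + h, p, t)
    xb, pb = flow(x - h, p, t)
    xc, pc = flow(x, p + h, t)
    xd, pd = flow(x, p - h, t)
    dxdx, dpdx = (xa - xb) / (2 * h), (pa - pb) / (2 * h)
    dxdp, dpdp = (xc - xd) / (2 * h), (pc - pd) / (2 * h)
    return dxdx * dpdp - dxdp * dpdx
```

For instance, `jacobian_determinant(0.8, 0.5, 2.0)` should be close to 1, even though the individual entries of the Jacobian matrix are not those of the identity.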

The existence of an invariant volume has important consequences for
the dynamics of a system. For example, for “confined” systems, an invariant
volume implies that the system exhibits “recurrence,” which means
(roughly) that for most initial conditions, the particle will eventually come
back arbitrarily close to its initial state in phase space. We will not, however,
delve into this aspect of the theory.

Note that the divergence of the vector field $X_H$ on the right-hand side of
(2.28), computed in (2.29), vanishes in a very
particular way, namely the sum of the $j$th and $(n + j)$th terms vanishes
for all $1 \le j \le n$. This stronger condition turns out to be equivalent to
the condition that the Hamiltonian flow $\Phi_t$ associated with an arbitrary
smooth function on $\mathbb{R}^{2n}$ preserves the symplectic form $\omega$, defined by

\[ \omega((\mathbf{x},\mathbf{p}), (\mathbf{x}',\mathbf{p}')) = \mathbf{x}\cdot\mathbf{p}' - \mathbf{p}\cdot\mathbf{x}'. \]

What this means, more precisely, is that for any $t \in \mathbb{R}$ and any $(\mathbf{x},\mathbf{p}) \in \mathbb{R}^{2n}$,
the matrix of partial derivatives of $\Phi_t$ at the point $(\mathbf{x},\mathbf{p})$, thought of as a
linear map of $\mathbb{R}^{2n}$ to $\mathbb{R}^{2n}$, preserves $\omega$. This property of $\Phi_t$, as it turns out,
is equivalent to the property that $\Phi_t$ preserves Poisson brackets, meaning
that

\[ \{f\circ\Phi_t, g\circ\Phi_t\} = \{f,g\}\circ\Phi_t \]

for all $f, g \in C^\infty(\mathbb{R}^{2n})$. A map $\Psi : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ that preserves $\omega$ is called
a symplectomorphism (in mathematics notation) or a canonical transformation
(in physics notation). We defer the proofs of these claims until
Chap. 21, where we can consider them in a more general setting.


Definition 2.28 For any smooth function $f$ on $\mathbb{R}^{2n}$, the Hamiltonian
flow generated by $f$ is the flow obtained by solving Hamilton’s equations (2.25)
with the Hamiltonian $H$ replaced by $f$. The function $f$ is called the
Hamiltonian generator of the associated flow.


Although any smooth function on $\mathbb{R}^{2n}$ can be inserted into Hamilton’s
equations to produce a flow, physically one should think that there is a 
distinguished function, the Hamiltonian H of the system, such that the 
flow generated by H is the time-evolution of the system. For any other 
function f, the Hamiltonian flow generated by f should not be thought 
of as time-evolution, but as some other flow, which might, for example, 
represent some family of symmetries of our system. 


Proposition 2.29 The Hamiltonian flow generated by the function

\[ f_{\mathbf{a}}(\mathbf{x},\mathbf{p}) := \mathbf{a}\cdot\mathbf{p} \tag{2.30} \]

is given by

\[
\begin{aligned}
\mathbf{x}(t) &= \mathbf{x}_0 + t\mathbf{a}, \\
\mathbf{p}(t) &= \mathbf{p}_0,
\end{aligned}
\tag{2.31}
\]

and the Hamiltonian flow generated by the function

\[ g_{\mathbf{b}}(\mathbf{x},\mathbf{p}) = \mathbf{b}\cdot\mathbf{x} \tag{2.32} \]

is given by

\[
\begin{aligned}
\mathbf{x}(t) &= \mathbf{x}_0, \\
\mathbf{p}(t) &= \mathbf{p}_0 - t\mathbf{b}.
\end{aligned}
\]

Proof. Direct calculation. ∎

What this means is that the Hamiltonian flow generated by a linear
combination of the momentum functions consists of translations in position
of the particle. That is to say, in the flow (2.31) generated by the function
$f_{\mathbf{a}}$ in (2.30), the particle’s initial position $\mathbf{x}_0$ is translated by $t\mathbf{a}$ while the
particle’s momentum is independent of $t$. Similarly, the Hamiltonian flow
generated by a linear combination of the position functions [the function
$g_{\mathbf{b}}$ in (2.32)] consists of translations in the particle’s momentum.


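Proposition 2.30 For a particle moving in $\mathbb{R}^2$, the Hamiltonian flow
generated by the angular momentum function

\[ J(\mathbf{x},\mathbf{p}) = x_1p_2 - x_2p_1 \]

consists of simultaneous rotations of $\mathbf{x}$ and $\mathbf{p}$. That is to say,

\[
\begin{aligned}
\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} &= \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}\begin{pmatrix} x_1(0) \\ x_2(0) \end{pmatrix}, \\
\begin{pmatrix} p_1(t) \\ p_2(t) \end{pmatrix} &= \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}\begin{pmatrix} p_1(0) \\ p_2(0) \end{pmatrix}.
\end{aligned}
\tag{2.33}
\]

Proof. If we plug the angular momentum function $J$ into Hamilton’s
equations in place of $H$, we obtain

\[
\begin{aligned}
\frac{dx_1}{dt} &= \frac{\partial J}{\partial p_1} = -x_2, & \frac{dp_1}{dt} &= -\frac{\partial J}{\partial x_1} = -p_2, \\
\frac{dx_2}{dt} &= \frac{\partial J}{\partial p_2} = x_1, & \frac{dp_2}{dt} &= -\frac{\partial J}{\partial x_2} = p_1.
\end{aligned}
\]

The solution to this system is given by the expression in the proposition,
as is easily verified by differentiation of (2.33). ∎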
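
The verification by differentiation mentioned in the proof can also be done numerically. In this sketch (function names are illustrative assumptions), the rotation formula (2.33) is differentiated by central differences and compared with the right-hand side of Hamilton’s equations with $H$ replaced by $J$.

```python
import math


def rotate(v, t):
    """Apply the rotation matrix with angle t to the vector v in R^2."""
    c, s = math.cos(t), math.sin(t)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def flow_J(x0, p0, t):
    """Candidate flow (2.33): rotate x and p simultaneously by the angle t."""
    return rotate(x0, t), rotate(p0, t)


def rhs_J(x, p):
    """(dx1/dt, dx2/dt, dp1/dt, dp2/dt) from Hamilton's equations with
    H replaced by J = x1 p2 - x2 p1."""
    return (-x[1], x[0], -p[1], p[0])


def residual(x0, p0, t, h=1e-6):
    """Largest mismatch between the numerical t-derivative of flow_J and
    rhs_J evaluated along the flow."""
    xa, pa = flow_J(x0, p0, t + h)
    xb, pb = flow_J(x0, p0, t - h)
    deriv = [(a - b) / (2 * h) for a, b in zip(xa + pa, xb + pb)]
    x, p = flow_J(x0, p0, t)
    return max(abs(d - r) for d, r in zip(deriv, rhs_J(x, p)))
```

The residual should be tiny at any parameter value, and the value of $J$ itself is unchanged along the flow, as expected since $\{J, J\} = 0$.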

Note that since the Hamiltonian flow generated by J does not have the 
interpretation of the time-evolution of the particle, the parameter t in (2.33) 
should not be interpreted as the physical time; it is just the parameter in a 
one-parameter group of diffeomorphisms. In this case, t is the angle of rota- 
tion. Thus, one answer to the question, “What is the angular momentum?” 
is that J is the Hamiltonian generator of rotations. 

If $f$ is any smooth function, then by the proof of Proposition 2.25, the
time derivative of any other function $g$ along the Hamiltonian flow generated
by $f$ is given by $dg/dt = \{g, f\}$. In particular, the derivative of the
Hamiltonian $H$ along the flow generated by $f$ is $\{H, f\}$. Thus, $f$ is constant
along the flow generated by $H$ if and only if $\{f, H\} = 0$, which holds if and
only if $\{H, f\} = 0$, which holds if and only if $H$ is constant along the flow
generated by $f$. This line of reasoning leads to the following result.


Conclusion 2.31 A function f is a conserved quantity for solutions of 
Hamilton’s equation (2.25) if and only if H is invariant under the Hamil- 
tonian flow generated by f. In particular, the angular momentum J is con- 
served if and only if H is invariant under simultaneous rotations of x and p. 


We will return to this way of thinking about conserved quantities in 
Chap. 21. Compare Exercise 12. 

The Hamiltonian framework can be extended in a straightforward way 
to systems of particles. 


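Proposition 2.32 Consider the phase space for a system of $N$ particles
moving in $\mathbb{R}^n$, namely $\mathbb{R}^{2nN}$, thought of as the set of $(2N)$-tuples of the
form

\[ (\mathbf{x}^1, \mathbf{p}^1, \ldots, \mathbf{x}^N, \mathbf{p}^N) \]

with $\mathbf{x}^j$ and $\mathbf{p}^j$ belonging to $\mathbb{R}^n$. Define the Poisson bracket of two smooth
functions $f$ and $g$ on the phase space by

\[ \{f,g\} = \sum_{j=1}^{N}\sum_{k=1}^{n}\left(\frac{\partial f}{\partial x^j_k}\frac{\partial g}{\partial p^j_k} - \frac{\partial f}{\partial p^j_k}\frac{\partial g}{\partial x^j_k}\right) \]

and consider a Hamiltonian function of the form

\[ H(\mathbf{x}^1, \mathbf{p}^1, \ldots, \mathbf{x}^N, \mathbf{p}^N) = \sum_{j=1}^{N}\frac{1}{2m_j}\,|\mathbf{p}^j|^2 + V(\mathbf{x}^1, \ldots, \mathbf{x}^N). \]

Then Newton’s law in the form $m_j\ddot{\mathbf{x}}^j = -\nabla^j V$ is equivalent to Hamilton’s
equations in the form

\[
\begin{aligned}
\frac{dx^j_k}{dt} &= \frac{\partial H}{\partial p^j_k}, \\
\frac{dp^j_k}{dt} &= -\frac{\partial H}{\partial x^j_k}.
\end{aligned}
\tag{2.34}
\]

For any smooth function $f$, the derivative of $f$ along a solution of Hamilton’s
equations is given by

\[ \frac{df}{dt} = \{f, H\}. \]

The proof of these results is entirely similar to the one-particle case and
is omitted.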
2.6 The Kepler Problem and the Runge-Lenz Vector


2.6.1 The Kepler Problem 


We consider now the classical Kepler problem, that of finding the
trajectories of a planet orbiting the sun. Since the sun is very much more
massive than any of the planets, we may consider the position of the sun
to be fixed at the origin of our coordinate system. The sun exerts a force
on a planet given by

\[ \mathbf{F} = -k\frac{\mathbf{x}}{|\mathbf{x}|^3}. \tag{2.35} \]

Here $k = GmM$, where $m$ is the mass of the planet, $M$ is the mass of the
sun, and $G$ is the universal gravitational constant. Note that the magnitude
of $\mathbf{F}$ is proportional to the reciprocal of the square of the distance from the
origin; thus, the force follows an inverse square law. Since $k$ contains a
factor of the mass $m$ of the planet, this quantity drops out of the equation
of motion, $m\ddot{\mathbf{x}} = \mathbf{F}$. The potential associated with the force (2.35) is easily
seen to be

\[ V(\mathbf{x}) = -\frac{k}{|\mathbf{x}|}. \tag{2.36} \]


Since our potential $V$ is invariant under rotations, the angular momentum
vector $\mathbf{J} = \mathbf{x}\times\mathbf{p}$ is a conserved quantity (Theorem 2.21 with $N = 1$ and
$n = 3$). If $\mathbf{J} = 0$, the particle is moving along a ray through the origin.
In that case, either the particle will pass through the origin at some point
in the future (if the initial momentum points toward the origin), or else
the particle must have passed through the origin at some point in the past
(if the initial momentum points away from the origin). Trajectories of this
sort are called collision trajectories, and we will regard such trajectories as
pathological.

We will, from now on, consider only trajectories along which the angular
momentum vector is nonzero. Fixing the energy and angular momentum of
the particle guarantees that the particle stays a certain minimum distance
from the origin (Exercise 20). Meanwhile, since $\mathbf{J} = \mathbf{x}\times\mathbf{p}$, the position
$\mathbf{x}(t)$ of the particle will always be perpendicular to the constant value of $\mathbf{J}$.
We will therefore refer to the plane (through the origin) perpendicular to
$\mathbf{J}$ as the “plane of motion.”


2.6.2 Conservation of the Runge-Lenz Vector 


We are going to obtain a description of the classical trajectories in an 
indirect way, using something called the Runge—Lenz vector. 
Definition 2.33 The Runge-Lenz vector is the vector-valued function
on $(\mathbb{R}^3\setminus\{0\})\times\mathbb{R}^3$ given by

\[ \mathbf{A}(\mathbf{x},\mathbf{p}) = \frac{1}{mk}\,\mathbf{p}\times\mathbf{J} - \frac{\mathbf{x}}{|\mathbf{x}|}. \]

Here $\mathbf{x}$ represents the position of a classical particle and $\mathbf{p}$ its momentum.


The significance of this vector is that it is a conserved quantity for the 
Kepler problem. Of course, whenever the potential energy is radial (a func- 
tion of the distance from the origin), the angular momentum vector is a 
conserved quantity. What is special about the 1/r potential of the Kepler 
problem is that there is another conserved vector-valued quantity. 


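Proposition 2.34 The Runge-Lenz vector is a conserved quantity for
Newton’s law with force given by (2.35).


Proof. Since $\mathbf{J}$ is conserved, we compute that

\[
\begin{aligned}
\dot{\mathbf{A}}(t) &= \frac{1}{mk}\,\mathbf{F}\times\mathbf{J} - \frac{\mathbf{p}}{m|\mathbf{x}|} + \frac{\mathbf{x}\,(\mathbf{x}\cdot\mathbf{p})}{m|\mathbf{x}|^3} \\
&= -\frac{1}{m|\mathbf{x}|^3}\,\mathbf{x}\times(\mathbf{x}\times\mathbf{p}) - \frac{\mathbf{p}}{m|\mathbf{x}|} + \frac{\mathbf{x}\,(\mathbf{x}\cdot\mathbf{p})}{m|\mathbf{x}|^3} \\
&= -\frac{1}{m|\mathbf{x}|^3}\left(\mathbf{x}\,(\mathbf{x}\cdot\mathbf{p}) - \mathbf{p}\,|\mathbf{x}|^2\right) - \frac{\mathbf{p}}{m|\mathbf{x}|} + \frac{\mathbf{x}\,(\mathbf{x}\cdot\mathbf{p})}{m|\mathbf{x}|^3} \\
&= 0.
\end{aligned}
\]

Here we have used the identity $\mathbf{b}\times(\mathbf{c}\times\mathbf{d}) = \mathbf{c}(\mathbf{b}\cdot\mathbf{d}) - \mathbf{d}(\mathbf{b}\cdot\mathbf{c})$, which holds
for all vectors $\mathbf{b}, \mathbf{c}, \mathbf{d} \in \mathbb{R}^3$. ∎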
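
Conservation of $\mathbf{A}$ can be observed along a numerically integrated orbit. The sketch below is only illustrative: the units $m = k = 1$, the initial data, and the Runge-Kutta integrator are assumptions made for the example, and the integrator conserves $\mathbf{A}$ only up to its own discretization error.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def runge_lenz(x, p, m, k):
    """A(x, p) = (1/(m k)) p x J - x/|x|, with J = x x p."""
    J = cross(x, p)
    pxJ = cross(p, J)
    r = math.sqrt(sum(c * c for c in x))
    return tuple(pxJ[i] / (m * k) - x[i] / r for i in range(3))


def deriv(x, p, m, k):
    """dx/dt = p/m and dp/dt = F = -k x/|x|^3, the force (2.35)."""
    r3 = math.sqrt(sum(c * c for c in x)) ** 3
    return tuple(c / m for c in p), tuple(-k * c / r3 for c in x)


def evolve(x, p, m, k, t, steps):
    """Approximate the Kepler flow for time t with classical RK4."""
    dt = t / steps

    def shift(u, v, s):
        return tuple(a + s * b for a, b in zip(u, v))

    for _ in range(steps):
        k1x, k1p = deriv(x, p, m, k)
        k2x, k2p = deriv(shift(x, k1x, dt / 2), shift(p, k1p, dt / 2), m, k)
        k3x, k3p = deriv(shift(x, k2x, dt / 2), shift(p, k2p, dt / 2), m, k)
        k4x, k4p = deriv(shift(x, k3x, dt), shift(p, k3p, dt), m, k)
        x = tuple(x[i] + dt * (k1x[i] + 2 * k2x[i] + 2 * k3x[i] + k4x[i]) / 6
                  for i in range(3))
        p = tuple(p[i] + dt * (k1p[i] + 2 * k2p[i] + 2 * k3p[i] + k4p[i]) / 6
                  for i in range(3))
    return x, p
```

With `x = (1.0, 0.0, 0.0)` and `p = (0.0, 1.2, 0.1)` the initial Runge-Lenz vector is `(0.45, 0, 0)` up to rounding, and it should remain essentially unchanged after evolving the orbit, even though $\mathbf{x}$ and $\mathbf{p}$ themselves change substantially.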


2.6.3 Ellipses, Hyperbolas, and Parabolas


We now use the Runge—Lenz vector to determine the trajectories for the 
Kepler problem. 


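Proposition 2.35 The magnitude of the Runge-Lenz vector $\mathbf{A}$ satisfies

\[ |\mathbf{A}|^2 = 1 + \frac{2\,|\mathbf{J}|^2}{mk^2}\,E, \]

where $E = |\mathbf{p}|^2/(2m) - k/|\mathbf{x}|$ is the energy of the particle. Furthermore,
if $\hat{\mathbf{x}} := \mathbf{x}/|\mathbf{x}|$ is the unit vector in the $\mathbf{x}$-direction, we have

\[ \mathbf{A}\cdot\hat{\mathbf{x}} = \frac{|\mathbf{J}|^2}{mk\,|\mathbf{x}|} - 1 \tag{2.37} \]

for all nonzero $\mathbf{x}$. It follows from (2.37) that

\[ |\mathbf{x}| = \frac{|\mathbf{J}|^2}{mk\,(1 + \mathbf{A}\cdot\hat{\mathbf{x}})}. \]

Note that from (2.37), $\mathbf{A}\cdot\hat{\mathbf{x}} > -1$ for all points $(\mathbf{x},\mathbf{p})$ with $\mathbf{x} \neq 0$.
Proof. Using the identity $\mathbf{b}\cdot(\mathbf{c}\times\mathbf{d}) = \mathbf{d}\cdot(\mathbf{b}\times\mathbf{c})$, we see that

\[ \hat{\mathbf{x}}\cdot(\mathbf{p}\times\mathbf{J}) = \mathbf{J}\cdot(\hat{\mathbf{x}}\times\mathbf{p}) = |\mathbf{J}|^2/|\mathbf{x}|. \]

Since $\mathbf{J}$ and $\mathbf{p}$ are orthogonal, we get

\[
\begin{aligned}
|\mathbf{A}|^2 &= \frac{1}{m^2k^2}\,|\mathbf{p}|^2\,|\mathbf{J}|^2 + 1 - \frac{2}{mk}\,\hat{\mathbf{x}}\cdot(\mathbf{p}\times\mathbf{J}) \\
&= 1 + \frac{2\,|\mathbf{J}|^2}{mk^2}\left(\frac{|\mathbf{p}|^2}{2m} - \frac{k}{|\mathbf{x}|}\right).
\end{aligned}
\]

Using again the identity for $\mathbf{b}\cdot(\mathbf{c}\times\mathbf{d})$, we next compute that

\[ \mathbf{A}\cdot\mathbf{x} = \frac{1}{mk}\,\mathbf{x}\cdot(\mathbf{p}\times\mathbf{J}) - \frac{\mathbf{x}\cdot\mathbf{x}}{|\mathbf{x}|} = \frac{|\mathbf{J}|^2}{mk} - |\mathbf{x}|. \]

We may now divide by $|\mathbf{x}|$ to obtain the desired expression for $\mathbf{A}\cdot\hat{\mathbf{x}}$. It is
then straightforward to solve for $|\mathbf{x}|$. ∎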


Corollary 2.36 Choose orthonormal coordinates in the plane of motion
so that $\mathbf{A}$ lies along the positive $x$-axis. If $r$ and $\theta$ are the polar
coordinates associated with this coordinate system, then along each trajectory
$(r(t), \theta(t))$, we have

\[ r(t) = \frac{|\mathbf{J}|^2}{mk}\,\frac{1}{1 + A\cos\theta(t)}, \tag{2.38} \]

where $A = |\mathbf{A}|$.
If $\mathbf{A} = 0$, any orthonormal coordinates can be used.


Proposition 2.37 If A := |A| < 1, (2.38) is the equation of an ellipse with
eccentricity A and with the origin being one focus of the ellipse. If A > 1, 
(2.38) is the equation of a hyperbola, and if A = 1, (2.38) is the equation 
of a parabola. 

The orbit of the particle in the plane of motion is an ellipse if the energy 
of the particle is negative, a hyperbola if the energy is positive, and a 
parabola if the energy is zero. 
FIGURE 2.2. Elliptical orbit for the Kepler problem, with two equal areas shaded.


Kepler’s first law is the assertion that planets move in elliptical trajec- 
tories with the sun at one focus, as shown in Fig. 2.2. The shaded regions 
indicate two equal areas that are swept out in equal times, in accordance 
with Kepler’s second law (Corollary 2.19). 

Recall that the eccentricity of an ellipse is √(1 − (b/a)²), where a is half
the length of the major axis and b is half the length of the minor axis.
Thus, when A = 0, we have b = a, meaning that the ellipse is a circle.

Proof. We continue to work in a coordinate system in which A is along
the positive x₁-axis. Then (2.38) becomes

\[
r = \frac{\alpha}{1 + A\cos\theta},
\]

where α = |J|²/(mk). From this we obtain

\[
1 = \frac{1}{\alpha}\left(\sqrt{x^2 + y^2} + Ax\right).
\]

Now we can solve for √(x² + y²), square both sides of the equation, and
simplify. Assuming A² ≠ 1, we obtain

\[
\frac{\alpha^2}{1 - A^2} = (1 - A^2)\left(x + \frac{A\alpha}{1 - A^2}\right)^2 + y^2. \tag{2.39}
\]

This is the equation of an ellipse (if A² < 1) or a hyperbola (if A² > 1),
where the center of the ellipse or hyperbola is the point (−Aα/(1 − A²), 0).
In light of the formula for A := |A| in Proposition 2.35, we obtain an ellipse
if the energy of the particle is negative and a hyperbola if the energy is
positive.

In the case A² < 1, we may readily compute the half-lengths a and b of
the major and minor axes as

\[
a = \frac{\alpha}{1 - A^2}; \qquad b = \frac{\alpha}{\sqrt{1 - A^2}}.
\]

From this, we readily calculate that the eccentricity is A. Now, the distance
between the foci of an ellipse is the length of the major axis times the
eccentricity, in our case, 2Aα/(1 − A²). Since the center of the ellipse in
(2.39) is at the point (−Aα/(1 − A²), 0), the origin is one focus of the ellipse.

If A² = 1, then when we perform the same analysis, x² drops out of the
equation and we obtain

\[
x = \frac{\alpha}{2} - \frac{1}{2\alpha}\,y^2,
\]

which is the equation of a parabola opening along the x-axis. This case
corresponds to energy zero. ∎
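
The elliptical case of the proof can be checked numerically. The sketch below, under the assumption 0 ≤ A < 1, computes the half-lengths a and b and the center from α and A, then confirms that points on the polar curve r = α/(1 + A cos θ) satisfy the Cartesian ellipse equation and that the center lies at distance Aa from the origin. The sample values of α and A and the function name are illustrative choices.

```python
import math

def ellipse_from_orbit(alpha, A):
    """For 0 <= A < 1, describe the ellipse r = alpha/(1 + A cos(theta)).

    Returns (a, b, center_x, eccentricity), where a and b are the
    half-lengths of the major and minor axes.
    """
    if not 0 <= A < 1:
        raise ValueError("the orbit is an ellipse only for 0 <= A < 1")
    a = alpha/(1 - A**2)
    b = alpha/math.sqrt(1 - A**2)
    center_x = -A*alpha/(1 - A**2)
    eccentricity = math.sqrt(1 - (b/a)**2)
    return a, b, center_x, eccentricity

alpha, A = 1.53, 0.4          # assumed sample values
a, b, center_x, eccentricity = ellipse_from_orbit(alpha, A)

# Sample the polar curve and test the Cartesian equation of the ellipse.
max_residual = 0.0
for n in range(360):
    theta = 2*math.pi*n/360
    r = alpha/(1 + A*math.cos(theta))
    x, y = r*math.cos(theta), r*math.sin(theta)
    residual = ((x - center_x)/a)**2 + (y/b)**2 - 1
    max_residual = max(max_residual, abs(residual))

focus_distance = A*a          # distance from the center to each focus

print("a, b:", a, b)
print("eccentricity:", eccentricity, "expected:", A)
print("center offset:", abs(center_x), "focus distance:", focus_distance)
print("largest residual:", max_residual)
```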

Note that Proposition 2.37 does not tell us how the particle moves along
the ellipse, hyperbola, or parabola as a function of time. We can, however,
determine this, at least in principle, by making use of the angular
momentum. After all, applying (2.17) in the plane of motion gives

\[
\frac{d\theta}{dt} = \frac{J}{mr^2}, \tag{2.40}
\]

where θ is the polar angle variable in the plane of motion. Since we have
computed r as a function of θ in Corollary 2.36, (2.40) gives us a
(first-order, separable) differential equation, which we can attempt to solve
to obtain θ (and thus also r) as a function of t.
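
One quantity that the separable equation dθ/dt = J/(mr²) yields directly is the period of a closed orbit: integrating dt = (mr(θ)²/J) dθ over one revolution gives the time for one circuit. The sketch below does this numerically for r(θ) = α/(1 + A cos θ) and compares the result with the closed form 2π√(ma³/k), where a = α/(1 − A²). That closed form is a standard statement of Kepler's third law and is used here as an outside assumption, as are the sample values and function names.

```python
import math

def period_by_quadrature(J, A, m, k, n=2000):
    """Integrate dt = (m r^2 / J) dtheta over one revolution.

    Uses the equally spaced rectangle rule, which is very accurate
    for smooth periodic integrands.
    """
    alpha = J**2/(m*k)
    h = 2*math.pi/n
    total = 0.0
    for i in range(n):
        theta = i*h
        r = alpha/(1 + A*math.cos(theta))
        total += m*r**2/J
    return total*h

def period_closed_form(J, A, m, k):
    """Kepler's third law in the form T = 2 pi sqrt(m a^3 / k)."""
    alpha = J**2/(m*k)
    a = alpha/(1 - A**2)      # half-length of the major axis
    return 2*math.pi*math.sqrt(m*a**3/k)

J, A, m, k = 1.3, 0.5, 2.0, 3.0     # assumed sample values with A < 1
T_numeric = period_by_quadrature(J, A, m, k)
T_exact = period_closed_form(J, A, m, k)
print(T_numeric, T_exact)
```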


2.6.4 Special Properties of the Kepler Problem


As we have said, the existence of another conserved vector-valued function— 
in addition to the conserved energy and angular momentum—is special to 
a potential of the form —k/ |x|. For a general radial potential, the energy 
and the angular momentum will be the only conserved quantities. Assuming 
J ≠ 0, the motion of a particle in any radial potential will always lie in the
plane perpendicular to J. Taking this into account, we think of our particle 








FIGURE 2.3. Trajectory in the plane of motion for a typical radial potential. 
as moving in R² rather than R³, and accordingly think of our phase space
as being four-dimensional rather than six-dimensional. From this point of
view, there are two remaining conserved quantities, the energy E and the
scalar angular momentum J in the plane, as given by Definition 2.17. Thus,
each trajectory will lie in a set of the form

{ (x, p) ∈ R² × R² | E(x, p) = a, J(x, p) = b }.

We refer to such a set as a joint level set of E and J. These sets are
two-dimensional surfaces inside our four-dimensional phase space.

For a general radial potential, a trajectory (x(t), p(t)) in phase space 
may not be a closed curve, but may fill up a dense subset of the joint 
level surface on which it lives. In particular, the trajectory x(t) in position 
space will typically not be a closed curve. For example, x(t) may trace out 
a roughly elliptical region in the plane, but where the axes of the ellipse 
“precess,” that is, vary with time. Such a trajectory is shown in Fig. 2.3, 
which should be contrasted with Fig. 2.2. 

In the Kepler problem, even after restricting attention to the plane of 
motion, we still have one conserved quantity in addition to E and J, namely 
the direction of A, which can be expressed in terms of the angle φ between
A and the x₁-axis in the plane of motion. (Note that both terms in the
definition of A lie in the plane of motion. Note also that the magnitude of A
is, by Proposition 2.35, computable in terms of E and J.) The trajectories
of the Kepler problem, then, lie in the joint level sets of E and J and φ,
which are one-dimensional. When E < 0, the joint level sets of E and J are
compact, in which case the joint level sets of E and J and φ are compact
and one-dimensional, that is, simple closed curves.

Another special property of the Kepler problem is that the period of the 
closed trajectories (the trajectories with negative energy) is the same for all 
trajectories with the same energy (Exercise 21). This apparent coincidence 
can be explained by showing that the Hamiltonian flows (Definition 2.28) 
generated by J and A act transitively on the energy surfaces. These flows 
commute with the time evolution of the system, because they are all con- 
served quantities (Conclusion 2.31). Thus, any two points with the same 
energy are “equivalent” with respect to time evolution. Although we will 
not go into the details of this analysis, we will gain a better understanding 
of the flows generated by the components of A in Sect. 18.4. 


2.7 Exercises 


1. Consider a particle moving in the real line in the presence of a force
coming from a potential function V. Given some value E₀ for the
energy of the particle, suppose that V(x) < E₀ for all x in some
closed interval [x₀, x₁]. Then a particle with initial position x₀ and
positive initial velocity will continue to move to the right until it
reaches x₁. Using (2.6), show that the time needed to travel from x₀
to x₁ is given by

\[
t = \int_{x_0}^{x_1} \sqrt{\frac{m}{2\,(E_0 - V(y))}}\;dy.
\]

Note: This shows that we can solve Newton's equation in R¹ more
or less explicitly for time as a function of position, which in principle
determines the position as a function of time.





2. In the notation of the previous problem, suppose now that V(x) < E₀
for x₀ ≤ x < x₁, but that V(x₁) = E₀.

(a) Show that if V′(x₁) ≠ 0, then the particle reaches x₁ in a finite
time.

(b) Show that if V′(x₁) = 0, then the time it takes the particle to
reach x₁ is infinite; that is, the particle approaches but never
actually reaches x₁.

Note: In Part (b), the point x₁ is an unstable equilibrium for the
system, that is, a critical point for V that is not a local minimum.


3. Consider the equation of motion of a pendulum of length L,

\[
\frac{d^2\theta}{dt^2} + \frac{g}{L}\sin\theta = 0,
\]

where g is the acceleration of gravity. Here θ is the angle between the
pendulum and the negative y-axis in the plane. This system has a
stable equilibrium at θ = 0 and an unstable equilibrium at θ = π.

Consider initial conditions of the form θ(0) = π − δ, (dθ/dt)(0) = 0, for
0 < δ < π/4. Fix some angle θ₀ and let T(δ) denote the time it takes
for the pendulum with the given initial conditions to reach the angle
θ₀. (Here θ₀ represents an arbitrarily chosen cutoff point at which the
pendulum is no longer "close" to θ = π.) Show that T(δ) grows only
logarithmically as δ tends to zero.

Note: Logarithmic growth of T as a function of δ corresponds to
exponential decay of δ as a function of T. Thus, if we want T to be
large, we must choose δ to be very small.


4. Consider a particle moving in the real line in the presence of a
"repelling potential," such that there is an A with V′(x) < 0 for
all x > A. Then a particle with initial position x₀ > A and positive
initial velocity will have positive velocity for all positive times.
Suppose now that V(x) = −xᵃ for all x > 1, for some positive constant
a. Suppose also that the particle is given initial position x₀ > 1 and
positive initial velocity. Show that for a > 2, the particle escapes to
infinity in finite time, but that for a ≤ 2, the position of the particle
remains finite for all finite times.

Hint: Use Problem 1.


5. Consider the equation mẍ + γẋ + kx = 0, where γ and k are positive
constants (the damping constant and spring constant, respectively).
Find the critical value γ_c of γ (for a fixed m and k) such that for
γ < γ_c, we get solutions that are sines and cosines times a decaying
exponential and for γ > γ_c, we get pure decaying exponentials.


6. Continue with the notation of Exercise 5. Given particular choices
for m, γ, and k, let r be the rate of exponential decay of a "generic"
solution to the equation of motion. Here, if the solution is of the form
ae^{−rt} cos(ωt) + be^{−rt} sin(ωt), the rate of exponential decay is r. If the
solution is of the form ae^{−r₁t} + be^{−r₂t}, then r = min(r₁, r₂), since
the slower-decaying term will dominate as long as a and b are both
nonzero.

For a fixed value of m and k, show that the maximum value for r
is achieved by taking γ = γ_c. (This accounts for the terminology
"critical damping" for the case in which γ = γ_c.)


7. Consider the R²-valued function F on R² \ {0} given by

\[
\mathbf{F}(x_1, x_2) = \left(-\frac{x_2}{x_1^2 + x_2^2},\; \frac{x_1}{x_1^2 + x_2^2}\right).
\]

Show that ∂F₁/∂x₂ − ∂F₂/∂x₁ = 0 but that there does not exist any
smooth function V on R² \ {0} with F = −∇V.

Hint: If F were of the form −∇V, we would have

\[
V(\mathbf{x}(b)) - V(\mathbf{x}(a)) = -\int_a^b \mathbf{F}(\mathbf{x}(t)) \cdot \frac{d\mathbf{x}}{dt}\,dt
\]

for every smooth path x(·) : [a, b] → R² \ {0}, by the fundamental
theorem of calculus and the chain rule.


8. Consider a particle moving in Rⁿ with a velocity-dependent force law
given by

\[
\mathbf{F}(\mathbf{x}, \mathbf{v}) = -\nabla V(\mathbf{x}) + \mathbf{F}_2(\mathbf{x}, \mathbf{v}),
\]

where the velocity-dependent term F₂ acts perpendicularly to the
velocity of the particle. (That is, we assume that v · F₂(x, v) = 0
for all x and v.) Let E denote the usual energy function E(x, v) =
½m|v|² + V(x), unmodified by the presence of the velocity-dependent
term in the force. Show that E is conserved.
9. (a) If r and θ are the usual polar coordinates on R², compute ∂θ/∂x₁
and ∂θ/∂x₂.

(b) If x(·) denotes the trajectory of a particle of mass m moving in
R², show that

\[
\frac{d}{dt}\,\theta(\mathbf{x}(t)) = \frac{1}{mr^2}\,J(\mathbf{x}(t), \mathbf{p}(t)).
\]

10. Prove Theorem 2.21, by imitating the proof of Proposition 2.18. You 
may assume that every rotation can be built up as a product of 
repeated rotations in the various coordinate planes (i.e., rotations in 
the (x_j, x_k) plane, for various pairs (j, k), where the same plane may
be used more than once). 


11. Consider Hamilton's equations for N particles moving in Rⁿ, as in
Proposition 2.32. Show that the total momentum p = Σ_{j=1}^{N} p^j of the
system is a conserved quantity if and only if the quantity

\[
H(\mathbf{x}^1 + \mathbf{a}, \mathbf{x}^2 + \mathbf{a}, \ldots, \mathbf{x}^N + \mathbf{a}, \mathbf{p}^1, \mathbf{p}^2, \ldots, \mathbf{p}^N)
\]

is independent of a for all x¹, …, x^N and p¹, …, p^N in Rⁿ.


Hint: Use (the N-particle version of) Conclusion 2.31. 


12. Let J denote the angular momentum of a particle moving in R².
Let R_θ denote a counterclockwise rotation by angle θ in R².

(a) If f is any smooth function on R⁴, show that

\[
\{f, J\}(\mathbf{x}, \mathbf{p}) = \left.\frac{d}{d\theta}\, f(R_\theta \mathbf{x}, R_\theta \mathbf{p})\right|_{\theta = 0}.
\]

(b) Let H be any smooth function on R⁴ and consider Hamilton's
equations with this function playing the role of the Hamiltonian.
Show that J is conserved (i.e., constant in time along any
solution of Hamilton's equations) if and only if

\[
H(R_\theta \mathbf{x}, R_\theta \mathbf{p}) = H(\mathbf{x}, \mathbf{p})
\]

for all θ in R and all x and p in R². (This argument is a more
explicit way to obtain Conclusion 2.31.)


13. Suppose that f and g are smooth functions on R²ⁿ and that at least
one of the two functions has compact support. Show that

\[
\int_{\mathbb{R}^{2n}} \{f, g\}(\mathbf{x}, \mathbf{p})\; d^n x\, d^n p = 0.
\]


Hint: Use integration by parts or Liouville’s theorem. 
14. Let X and Y be "vector fields" on Rⁿ, viewed as first-order differential
operators. This means that X and Y are of the form

\[
X = \sum_{j=1}^{n} a_j(\mathbf{x})\,\frac{\partial}{\partial x_j}; \qquad
Y = \sum_{j=1}^{n} b_j(\mathbf{x})\,\frac{\partial}{\partial x_j}.
\]

[If X(x) = (a₁(x), …, aₙ(x)), then the operator X is the directional
derivative in the direction of X. It is common to identify the
vector-valued function X with the associated first-order differential operator
X.]

Show that the commutator [X, Y] of X and Y, defined by

\[
[X, Y] = XY - YX,
\]

is again a vector field (i.e., a first-order differential operator).


15. Given a smooth function f on R²ⁿ, define an operator X_f, acting on
C^∞(R²ⁿ), by the formula

\[
X_f(g) = \{f, g\}.
\]

That is to say,

\[
X_f = \sum_{j=1}^{n} \left(\frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j}\right).
\]

The operator X_f is called the Hamiltonian vector field associated
with the function f. (Here, as in Exercise 14, we identify vector fields
with first-order differential operators.)

(a) Show that for all f, g ∈ C^∞(R²ⁿ), we have

\[
X_{\{f, g\}} = [X_f, X_g],
\]

where [X_f, X_g] = X_f X_g − X_g X_f.

Hint: By Exercise 14, all terms in the computation of [X_f, X_g](h)
involving second derivatives of h can be neglected, since they will
always cancel out to zero.

(b) Use Part (a) to compute {{f, g}, h} = X_{{f,g}}(h) and thereby
obtain another proof of the Jacobi identity for the Poisson bracket.


16. Recall the definition of a Hamiltonian vector field X_f in Exercise 15.

(a) Consider a smooth vector field X on R² (viewed as a first-order
differential operator as in Exercise 14) of the form

\[
X = g_1(x, p)\,\frac{\partial}{\partial x} + g_2(x, p)\,\frac{\partial}{\partial p}.
\]

Show that X can be expressed as X = X_f, for some f ∈
C^∞(R²), if and only if X is divergence free, that is, if and only
if

\[
\frac{\partial g_1}{\partial x} + \frac{\partial g_2}{\partial p} = 0.
\]

Hint: As in Proposition 2.7, given a pair of functions h₁ and h₂
on R², there exists a function f with ∂f/∂x = h₁ and ∂f/∂p =
h₂ if and only if we have ∂h₁/∂p = ∂h₂/∂x.

(b) Show that there exists a smooth vector field X on R⁴ of the form

\[
X = \sum_{j=1}^{2} \left(g_j(\mathbf{x}, \mathbf{p})\,\frac{\partial}{\partial x_j} + g_{j+2}(\mathbf{x}, \mathbf{p})\,\frac{\partial}{\partial p_j}\right)
\]

such that

\[
\nabla \cdot X = \sum_{j=1}^{2} \left(\frac{\partial g_j}{\partial x_j} + \frac{\partial g_{j+2}}{\partial p_j}\right) = 0
\]

but such that there does not exist f ∈ C^∞(R⁴) with X = X_f.

Hint: You should be able to find a counterexample in which the
coefficient functions g_j are linear.


17. Show that the space of homogeneous polynomials of degree 2 on R²ⁿ
is closed under the Poisson bracket.

18. Determine the Hamiltonian flow on R² generated by the function
f(x, p) = xp.

19. Let J denote the angular momentum vector for a particle moving in
R³, namely J = x × p. Show that the components J₁, J₂, and J₃ of
J satisfy the following Poisson bracket relations:

{J₁, J₂} = J₃;  {J₂, J₃} = J₁;  {J₃, J₁} = J₂.

20. In the Kepler problem, show that for each real number E and positive
number J, there exists ε > 0 such that for all (x, p) with E(x, p) = E
and |J(x, p)| = J, we have |x| ≥ ε.

Hint: Suppose that (xₙ, pₙ) is a sequence with |J(xₙ, pₙ)| = J and
|xₙ| tending to zero. Show that E(xₙ, pₙ) tends to +∞.

21. (a) Determine the area of the ellipse in the plane of motion in
Proposition 2.37, in the case A < 1.

(b) Show that the time T it takes the particle to travel once around
the ellipse is given by

\[
T = \frac{\pi}{\sqrt{2}}\,GM\,(-\tilde{E})^{-3/2},
\]

where Ẽ is the "massless energy" of the particle, given by

\[
\tilde{E} = \frac{E}{m} = \frac{1}{2}\,|\dot{\mathbf{x}}|^2 - \frac{GM}{|\mathbf{x}|}.
\]

Note that in the case where the trajectory in the plane of motion is
elliptical, the energy of the particle is negative.

Note: The result of Part (b) is closely related to Kepler's third law.
3 A First Approach to Quantum Mechanics


In this chapter, we try to understand the main ideas of quantum mechanics. 
In quantum mechanics, the outcome of a measurement cannot—even in 
principle—be predicted beforehand; only the probabilities for the outcome 
of the measurement can be predicted. These probabilities are encoded in a 
wave function, which is a function of a position variable x € R”. The square 
of the absolute value of the wave function encodes the probabilities for the 
position of the particle. Meanwhile, the probabilities for the momentum of 
the particle are encoded in the frequency of oscillation of the wave function. 
The probabilities can be described using the position operator and the 
momentum operator. The time-evolution of the wave function is described 
by the Hamiltonian operator, which is analogous to the Hamiltonian (or 
energy) function in Hamilton’s equations. 


3.1 Waves, Particles, and Probabilities 


There are two key ingredients to quantum theory, both of which arose from 
experiments. The first ingredient is wave—particle duality, in which objects 
are observed to have both wavelike and particlelike behavior. Light, for 
example, was thought to be a wave throughout much of the nineteenth 
century, but was observed in the early twentieth century to have parti- 
cle behavior as well. Electrons, meanwhile, were originally thought to be 
particles, but were then observed to have wave behavior. 
The second ingredient of quantum theory is its probabilistic behavior.
In the two-slit experiment, for example, electrons that are “identically 
prepared” do not all hit the screen at the same point. Quantum theory 
postulates that this randomness is fundamental to the way nature behaves. 
According to quantum mechanics, it is impossible (theoretically, not just 
in practice) to predict ahead of time what the outcome of an experiment 
will be. The best that can be done is to predict the probabilities for the 
outcome of an experiment. 

These two aspects of quantum theory come together in the wave function. 
The wave function is a function of a variable x € R”, which we interpret as 
describing the possible values of the position of a particle, and it evolves in 
time according to a wavelike equation (the Schrödinger equation). The wave
function and its time-evolution account for the wave aspect of quantum 
theory. The particle aspect of the theory comes from the interpretation of 
the wave function. Although it is tempting to interpret the wave function 
as a sort of cloud, where we have, say, a little bit of electron-cloud over 
here, and little bit of electron-cloud over there, this interpretation is not 
consistent with experiment. Whenever we attempt to measure the position 
of a single electron, we always find the electron at a single point. A single 
electron in the two-slit experiment is observed at a single point on the 
screen, not spread out over the screen the way the wave function is. The 
wave function does not describe something that is directly observable for a 
single particle; rather, the wave function determines the statistical behavior 
of a whole sequence of identically prepared particles. See Fig. 1.4 for a
dramatic experimental demonstration of this effect. 

In the two-slit experiment, for example, it is possible to determine how 
the wave function behaves as a function of time by solving the (deterministic)
Schrödinger equation. Knowledge of the wave function of an individual
electron, however, does not determine where that electron will hit the
screen. The wave function merely tells us the probability distribution for 
where the electron might hit the screen, something that is only observable 
by shooting a whole sequence of electrons at the screen. 

It is an oversimplification, but a useful one, to describe the wave—particle 
aspect of quantum theory in this way: a single electron (or photon, or 
whatever) acts like a particle, but a large collection of electrons behaves 
like a wave. A single measurement of a single electron always gives its 
position as a point, just as we would expect for a particle. This point, 
however, varies from one electron to the next, even if we shoot each electron 
toward the screen in precisely the same way. Repeated measurements of 
identically prepared electrons give a distribution that can, for example, 
exhibit interference patterns, just as we would expect for a wave. See, again, 
Fig. 1.4, which should be compared to Figs. 1.1 and 1.2. 

It is interesting to note that at the macroscopic scale, where quantum ef- 
fects are not apparent, light appears to be a wave, whereas electrons appear 
to be particles. This is the case even though both light and electrons are 
really wave-particle hybrids, described in probabilistic terms by a wave
function. The difference between the two situations is that photons (the par- 
ticles of light) have mass zero, whereas electrons have positive mass. This 
means that photons, unlike electrons, can easily be created and destroyed 
even at low energies. Thus, the discrete aspect of light—namely, that the 
energy in light comes only in discrete “quanta,” namely the photons—is 
less evident than the corresponding discrete aspect of electrons. 


3.2 A Few Words About Operators 
and Their Adjoints 


In quantum mechanics, physical quantities—such as position, momentum, 
and energy—are represented by operators on a certain Hilbert space H. 
These operators are unbounded operators, reflecting that in classical me- 
chanics, these quantities are unbounded functions on the classical phase 
space. In this section, we look briefly at some technical issues related to 
unbounded operators and their adjoints. We will delay a full discussion of 
these technicalities (Chap. 9) until after we have understood the basic ideas 
of quantum mechanics. 

Here and throughout the book, H will represent a Hilbert space over C, 
always assumed to be separable. We follow the convention in the physics 
literature that the inner product be linear in the second factor: 


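\[
\langle \phi, \lambda\psi \rangle = \lambda \langle \phi, \psi \rangle; \qquad
\langle \lambda\phi, \psi \rangle = \bar{\lambda} \langle \phi, \psi \rangle
\]

for all φ, ψ ∈ H and all λ ∈ C.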
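
The convention can be illustrated in the finite-dimensional Hilbert space Cⁿ, where the inner product is a finite sum. The sketch below is only an illustration: the function name `inner` and the sample vectors are assumptions, and Cⁿ stands in for the general Hilbert space H.

```python
def inner(phi, psi):
    """Inner product on C^n: conjugate-linear in phi, linear in psi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

phi = [1 + 2j, -1j, 0.5 + 0j]
psi = [2 - 1j, 3 + 0j, 1j]
lam = 0.3 - 1.7j

base = inner(phi, psi)

# A scalar in the second factor comes out unchanged.
second_factor = inner(phi, [lam * c for c in psi])

# A scalar in the first factor comes out complex-conjugated.
first_factor = inner([lam * c for c in phi], psi)

print(second_factor, lam * base)
print(first_factor, lam.conjugate() * base)
```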

Recall (Appendix A.3.4) that a linear operator A : H → H is bounded
if there is a constant C such that ‖Aψ‖ ≤ C‖ψ‖ for all ψ ∈ H. For any
bounded operator A, there is a unique bounded operator A*, called the
adjoint of A, such that

\[
\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle
\]

for all φ, ψ ∈ H. The existence of A* follows from the Riesz theorem
(Appendix A.4.3), by observing that for each fixed φ, the map ψ ↦ ⟨φ, Aψ⟩
is a bounded linear functional on H. A bounded operator is said to be
self-adjoint if A* = A.

For various reasons, both physical and mathematical, we want the
operators of quantum mechanics to be self-adjoint. Once one
sees the formulas for these operators, however, one is confronted with a
serious technical difficulty: the operators are not bounded.

If A is a linear operator defined on all of H and having the property
that ⟨φ, Aψ⟩ = ⟨Aφ, ψ⟩ for all φ, ψ ∈ H, then A is automatically bounded.
(See Corollary 9.9.) To put this fact the other way around, an unbounded
self-adjoint operator cannot be defined on the entire Hilbert space. Thus, to
deal with the unbounded operators of quantum mechanics, we must deal
with operators that are defined only on a subspace of the relevant Hilbert
space, called the domain of the operator.


Definition 3.1 An unbounded operator A on H is a linear map from
a dense subspace Dom(A) ⊂ H into H.


More precisely, the operator A is “not necessarily bounded,” since noth- 
ing in the definition prevents us from having Dom(A) = H and having A 
be bounded. 

In defining the adjoint of an unbounded operator, we immediately encounter
a difficulty: for a given φ ∈ H, the linear functional ⟨φ, A·⟩ may
not be bounded, in which case we cannot use the Riesz theorem to define
A*φ. What this means is that the adjoint of A, like A itself, will be defined
not on all of H but only on some subspace thereof.


Definition 3.2 For an unbounded operator A on H, the adjoint A* of A
is defined as follows. A vector φ ∈ H belongs to the domain Dom(A*) of
A* if the linear functional

\[
\langle \phi, A\,\cdot\, \rangle,
\]

defined on Dom(A), is bounded. For φ ∈ Dom(A*), let A*φ be the unique
vector χ such that

\[
\langle \chi, \psi \rangle = \langle \phi, A\psi \rangle
\]

for all ψ ∈ Dom(A).

Saying that the linear functional ⟨φ, A·⟩ is bounded means that there is
a constant C such that |⟨φ, Aψ⟩| ≤ C‖ψ‖ for all ψ ∈ Dom(A). If ⟨φ, A·⟩ is
bounded, then since Dom(A) is dense, the BLT theorem (Theorem A.36)
tells us that ⟨φ, A·⟩ has a unique bounded extension to all of H. The Riesz
theorem then guarantees the existence and uniqueness of χ. The adjoint of
an unbounded linear operator is a linear operator on its domain.

We are now ready to define self-adjointness (and some related notions) 
for unbounded operators. 


Definition 3.3 An unbounded operator A on H is symmetric if

\[
\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle
\]

for all φ, ψ ∈ Dom(A). The operator A is self-adjoint if Dom(A*) =
Dom(A) and A*φ = Aφ for all φ ∈ Dom(A). Finally, A is essentially
self-adjoint if the closure in H × H of the graph of A is the graph of a
self-adjoint operator.


That is to say, A is self-adjoint if A* and A are the same operator with
the same domain. Every self-adjoint or essentially self-adjoint operator is
symmetric, but not every symmetric operator is essentially self-adjoint.
For any symmetric operator, Dom(A*) ⊃ Dom(A) and A* agrees with A
on Dom(A). The reason a symmetric operator may fail to be self-adjoint is
that Dom(A*) may be strictly larger than Dom(A).

Although the condition of being symmetric is certainly easier to
understand (and to verify) than the condition of being self-adjoint,
self-adjointness is the "right" condition. In particular, the spectral theorem,
which is essential to much of quantum mechanics, applies only to operators
that are self-adjoint and not to operators that are merely symmetric. If A
is essentially self-adjoint, then we can obtain a self-adjoint operator from
A simply by taking the closure of the graph of A, and we can then apply
the spectral theorem to this self-adjoint operator. Thus, for many purposes,
it is enough to have our operators be essentially self-adjoint rather than
self-adjoint.

It is generally easy to verify that the operators of quantum mechanics 
(those representing position, momentum, and so forth) are symmetric on 
some suitably chosen domain. Proving that these operators are essentially 
self-adjoint, however, is substantially more difficult. Although establishing 
essential self-adjointness is a crucial technical issue, it is best not to worry 
too much about it on a first encounter with quantum mechanics. In this 
chapter, we will not concern ourselves overly with technical details con- 
cerning essential self-adjointness and the precise choice of domain for our 
operators, depending on Chap. 9 to take care of such matters. For now, we 
content ourselves with deriving some very elementary properties of sym- 
metric (and thus also self-adjoint) operators. 


Proposition 3.4 Suppose A is a symmetric operator on H.

1. For all ψ ∈ Dom(A), the quantity ⟨ψ, Aψ⟩ is real. More generally, if
ψ, Aψ, …, Aᵐ⁻¹ψ all belong to Dom(A), then ⟨ψ, Aᵐψ⟩ is real.

2. Suppose λ is an eigenvalue for A, meaning that Aψ = λψ for some
nonzero ψ ∈ Dom(A). Then λ ∈ R.

Proof. Since A is symmetric, we have

\[
\langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle = \overline{\langle \psi, A\psi \rangle}
\]

for all ψ ∈ Dom(A). If ψ, Aψ, …, Aᵐ⁻¹ψ all belong to the domain of A,
we can use the symmetry of A repeatedly to show that

\[
\langle \psi, A^m\psi \rangle = \langle A^m\psi, \psi \rangle = \overline{\langle \psi, A^m\psi \rangle}.
\]

Meanwhile, if ψ is an eigenvector for A with eigenvalue λ, then

\[
\lambda \langle \psi, \psi \rangle = \langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle = \bar{\lambda} \langle \psi, \psi \rangle.
\]

Since ψ is assumed to be nonzero, this implies that λ = λ̄. ∎
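
Proposition 3.4 can be illustrated in the simplest possible setting, a Hermitian 2 × 2 matrix acting on C², where every operator is bounded and the domain questions of this section do not arise. The matrix, the vector, and the helper names below are assumptions chosen for illustration; the eigenvalues are obtained from the trace and determinant through the quadratic formula.

```python
import cmath

def inner(phi, psi):
    """Inner product on C^n, linear in the second factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product A psi."""
    return [sum(A[i][j] * psi[j] for j in range(len(psi)))
            for i in range(len(A))]

# A Hermitian (symmetric) matrix: A[i][j] equals the conjugate of A[j][i].
A = [[2 + 0j, 1 - 3j],
     [1 + 3j, -1 + 0j]]
psi = [0.6 + 0.2j, -0.3 + 0.7j]

expectation = inner(psi, apply(A, psi))               # <psi, A psi>
second_moment = inner(psi, apply(A, apply(A, psi)))   # <psi, A^2 psi>

# Eigenvalues of a 2x2 matrix from its characteristic polynomial.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = cmath.sqrt(trace * trace - 4 * det)
eigenvalues = [(trace + disc) / 2, (trace - disc) / 2]

print("<psi, A psi>   =", expectation)
print("<psi, A^2 psi> =", second_moment)
print("eigenvalues    =", eigenvalues)
```

The imaginary parts vanish up to rounding, as the proposition predicts; replacing an off-diagonal entry so that the matrix is no longer Hermitian generally destroys this.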




Physically, ⟨ψ, Aψ⟩ represents (as we will see later in this chapter)
the expectation value for measurements of A in the state ψ, whereas the
eigenvalue λ represents one of the possible values for this measurement.
On physical grounds, we want both of these numbers to be real. If A is
self-adjoint, and not just symmetric, then the spectral theorem will give
a canonical way of associating to each ψ ∈ H a probability measure on
the real line that encodes the probabilities for measurements of A in the
state ψ.


3.3 Position and the Position Operator 


Let us consider at first a single particle moving on the real line. The wave
function for such a particle is a map ψ : R¹ → C. Although this map will
evolve in time, let us think for now that the time is fixed. The function
|ψ(x)|² is supposed to be the probability density for the position of the
particle. This means that the probability that the position of the particle
belongs to some set E ⊂ R¹ is

\[
\int_E |\psi(x)|^2\,dx.
\]

For this prescription to make sense, ψ should be normalized so that

\[
\int_{\mathbb{R}} |\psi(x)|^2\,dx = 1. \tag{3.1}
\]

That is, ψ should be a unit vector in the Hilbert space L²(R).

Now, if the function |ψ(x)|² is the probability density for the position of
a particle, then according to the standard definitions of probability theory,
the expectation value of the position will be

\[
E(x) = \int_{\mathbb{R}} x\,|\psi(x)|^2\,dx, \tag{3.2}
\]

provided that the integral is absolutely convergent. More generally, we can
compute any moment of the position (i.e., the expectation value of some
power of the position) as

\[
E(x^m) = \int_{\mathbb{R}} x^m\,|\psi(x)|^2\,dx, \tag{3.3}
\]

assuming, again, the convergence of the integral.
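
As a concrete instance of (3.1) through (3.3), consider a Gaussian wave function whose squared modulus is the normal density with mean μ and standard deviation s. The sketch below approximates the integrals by the trapezoid rule on a window wide enough that the tails are negligible; the choice of wave function, the parameter values, and the function names are assumptions made for illustration.

```python
import math

def psi(x, mu, s):
    """Gaussian wave function with |psi|^2 the normal density N(mu, s^2)."""
    return (2 * math.pi * s * s) ** (-0.25) * math.exp(-(x - mu) ** 2 / (4 * s * s))

def position_moment(power, mu, s, n=20000, width=12.0):
    """Approximate the integral of x^power |psi(x)|^2 dx by the trapezoid rule.

    The integration window is [mu - width*s, mu + width*s].
    """
    a, b = mu - width * s, mu + width * s
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * x ** power * abs(psi(x, mu, s)) ** 2
    return total * h

mu, s = 1.5, 0.7
norm_squared = position_moment(0, mu, s)    # should be 1, as in (3.1)
mean = position_moment(1, mu, s)            # E(x) from (3.2)
second = position_moment(2, mu, s)          # E(x^2) from (3.3)

print(norm_squared, mean, second)
```

For this wave function the exact values are 1, μ, and s² + μ², which gives a simple check on the quadrature.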

A key idea in quantum theory is to express expectation values of various
quantities (position, momentum, energy, etc.) in terms of operators and
the inner product on the relevant Hilbert space, in this case, L²(R). In the
case of position, we may introduce the position operator X defined by

\[
(X\psi)(x) = x\,\psi(x).
\]
That is, X is the "multiplication by x" operator. The point of introducing
this operator is that the expectation value of the position [defined in (3.2)]
may now be expressed as

\[
E(x) = \langle \psi, X\psi \rangle,
\]

where the inner product is the usual one on L²(R):

\[
\langle \phi, \psi \rangle = \int_{\mathbb{R}} \overline{\phi(x)}\,\psi(x)\,dx.
\]

(Recall that we are following the physics convention of putting the
conjugate on the first factor in the inner product.)
We use the following notation for the expectation value of the operator
X in the state ψ:

\[
\langle X \rangle_\psi = \langle \psi, X\psi \rangle.
\]

The higher moments of the position, as defined in (3.3), are also computable
in terms of the position operator:

\[
E(x^m) = \langle \psi, X^m\psi \rangle.
\]


At this point, it is not clear that we have gained anything by writing 
our moments in terms of an operator and the inner product instead of in 
terms of the integral (3.3). The operator description will, however, motivate 
a parallel description of moments for the momentum, energy, or angular 
momentum of a particle in terms of corresponding operators. 

It should be noted that, for a given ψ ∈ L²(R), Xψ might fail to be in
L²(R). This failure of X to be defined on all of our Hilbert space reflects
that X is an unbounded operator, something that we discussed briefly in
Sect. 3.2. Even if Xψ is in L²(R), Xᵐψ might fail to be in L²(R) for some
m. Nevertheless, for any unit vector ψ in L²(R), we have a well-defined
probability density on R, given by |ψ(x)|².
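
Both failures can be seen in standard examples, given here as illustrations rather than taken from the text. The first function ψ is a unit vector whose squared modulus is the Cauchy density, so that Xψ is not square integrable; the second function φ is a unit vector with Xφ in L²(R) but X²φ not.

```latex
\[
\psi(x) = \frac{1}{\sqrt{\pi\,(1 + x^2)}}, \qquad
\int_{\mathbb{R}} |\psi(x)|^2\,dx = \frac{1}{\pi}\int_{\mathbb{R}} \frac{dx}{1 + x^2} = 1, \qquad
\int_{\mathbb{R}} |x\,\psi(x)|^2\,dx = \frac{1}{\pi}\int_{\mathbb{R}} \frac{x^2}{1 + x^2}\,dx = \infty.
\]
\[
\phi(x) = \sqrt{\frac{2}{\pi}}\,\frac{1}{1 + x^2}, \qquad
\int_{\mathbb{R}} |\phi(x)|^2\,dx = 1, \qquad
\int_{\mathbb{R}} |x\,\phi(x)|^2\,dx = \frac{2}{\pi}\int_{\mathbb{R}} \frac{x^2}{(1 + x^2)^2}\,dx = 1, \qquad
\int_{\mathbb{R}} |x^2\,\phi(x)|^2\,dx = \infty.
\]
```

In the first example the integral defining the expectation value of the position is not absolutely convergent, even though |ψ(x)|² is a perfectly good probability density.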





3.4 Momentum and the Momentum Operator 


At any fixed time, the wave function ψ(x) of a particle (according to the
wave theory postulated by Schrödinger) is a function of a "position"
variable x only. Although the wave function ψ directly encodes the probabilities
for the position of the particle, through |ψ(x)|², it is not as clear how
information about the particle's momentum is encoded. As it turns out, the
momentum is encoded in the oscillations of the wave function. A crucial
idea in quantum mechanics is the de Broglie hypothesis, which we
introduced in Sect. 1.2.2 as a way of understanding the allowed energies in the
Bohr model of the hydrogen atom. The de Broglie hypothesis proposes
a particular relationship between the frequency of oscillation of the wave
function (as a function of position at a fixed time) and its momentum.
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Proposition 3.5 (de Broglie hypothesis) If the wave function of a 
particle has spatial frequency k, then the momentum p of the particle is 


p= hk, (3.4) 
where h is Planck’s constant. 


The Davisson—Germer electron-diffraction experiments, described in Sect. 
1.2.3, strongly support not only the idea that electrons have wavelike 
behavior, but also the specific relationship (3.4) between the momentum 
of an electron and the spatial frequency of the associated wave. Of course, 
Proposition 3.5 is rather vague. To be a bit more precise, Proposition 3.5 is
supposed to mean that a wave function of the form $\psi(x) = e^{ikx}$ represents
a particle with momentum $p = \hbar k$. [Here, as in Chap. 2, “frequency” is in
the angular sense. The cycles-per-unit-distance frequency is $\nu = k/(2\pi)$.]

Now, the function $e^{ikx}$ is obviously not square integrable, so it is not
strictly possible for the wave function [which is supposed to satisfy (3.1)]
to be $e^{ikx}$. Let us therefore briefly switch to thinking of a particle on a circle,
so that we can avoid certain technicalities. We think of the wave function
$\psi$ for a particle on a circle as a $2\pi$-periodic function on $\mathbb{R}$, satisfying the
normalization condition

$$\int_0^{2\pi} |\psi(x)|^2\,dx = 1.$$


For any integer $k$, it makes sense to say that the normalized wave function
$\psi(x) = e^{ikx}/\sqrt{2\pi}$ represents a particle with momentum $p = \hbar k$. In this case,
we are supposed to think that the momentum of the particle is definite,
that is, nonrandom. If the particle’s wave function is $e^{ikx}/\sqrt{2\pi}$, then a
measurement of the particle’s momentum should (with probability 1) give
the value $\hbar k$.

Now, the functions $e^{ikx}/\sqrt{2\pi}$, $k \in \mathbb{Z}$, form an orthonormal basis for the
Hilbert space of $2\pi$-periodic, square-integrable functions, which may be
identified with $L^2([0,2\pi])$. Thus, the typical wave function for a particle on
a circle is

$$\psi(x) = \sum_{k=-\infty}^{\infty} a_k \frac{e^{ikx}}{\sqrt{2\pi}}, \tag{3.5}$$

where the sum is convergent in $L^2([0,2\pi])$. If $\psi$ is normalized to be a unit
vector, then we have

$$\sum_{k=-\infty}^{\infty} |a_k|^2 = \|\psi\|^2_{L^2([0,2\pi])} = 1. \tag{3.6}$$


For a particle with wave function given by (3.5), the momentum of the
particle is no longer definite. Rather, we are supposed to think that a
measurement of the particle’s momentum will yield one of the values $\hbar k$,
$k \in \mathbb{Z}$, with the probability of getting a particular value $\hbar k$ being $|a_k|^2$.
Following elementary probability theory, then, the expectation value for
the momentum should be

$$E(p) = \sum_{k=-\infty}^{\infty} \hbar k\,|a_k|^2, \tag{3.7}$$

and higher moments for the momentum should be

$$E(p^m) = \sum_{k=-\infty}^{\infty} (\hbar k)^m\,|a_k|^2, \tag{3.8}$$


assuming absolute convergence of the sum. 
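The sums (3.7) and (3.8) are easy to evaluate when only finitely many coefficients are nonzero. The sketch below does this for a hypothetical set of coefficients $a_k$ in units where $\hbar = 1$; both choices are assumptions made purely for illustration.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption for illustration)

# Hypothetical Fourier coefficients a_k of a wave function on the circle,
# chosen so that the sum of |a_k|^2 is 1, as required by (3.6).
coeffs = {-1: 0.5, 0: 0.5, 2: 0.5 + 0.5j}

def momentum_moment(coeffs, m, hbar=HBAR):
    """E(p^m) as in (3.8): the value hbar*k occurs with probability |a_k|^2."""
    return sum((hbar * k) ** m * abs(a) ** 2 for k, a in coeffs.items())

total_probability = sum(abs(a) ** 2 for a in coeffs.values())
mean_momentum = momentum_moment(coeffs, 1)    # -1*(0.25) + 0*(0.25) + 2*(0.5)
second_moment = momentum_moment(coeffs, 2)    #  1*(0.25) + 0*(0.25) + 4*(0.5)
```

Only the moduli $|a_k|$ enter these sums, so the phases of the coefficients play no role in the momentum statistics.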

We would like to encode the moment conditions (3.7) and (3.8) in a
momentum operator $P$, which should be defined in such a way that if the
particle’s wave function $\psi$ is given by (3.5), then $E(p^m) = \langle \psi, P^m\psi \rangle$.
We can achieve this relation if $P$ satisfies

$$P e^{ikx} = \hbar k\, e^{ikx}, \tag{3.9}$$
since then,
$$\langle \psi, P^m\psi \rangle = \sum_{k=-\infty}^{\infty} (\hbar k)^m\,|a_k|^2 = E(p^m). \tag{3.10}$$

The (presumably unique) choice for $P$ satisfying (3.9) is

$$P = -i\hbar \frac{d}{dx}.$$
Returning now to the setting of the real line, it is natural to postulate
that the momentum operator $P$ on the line should also be given by
$P = -i\hbar\, d/dx$. This operator satisfies the relation

$$P e^{ikx} = (\hbar k)\, e^{ikx},$$

which is supposed to capture the idea that the wave function $e^{ikx}$ has
momentum $\hbar k$. Although the function $e^{ikx}$ is not square-integrable with
respect to $x$, the Fourier transform allows us to build up any square-integrable
function as a “superposition” of functions of the form $e^{ikx}$. (Superposition
is the term physicists use for a linear combination or the continuous analog
thereof, namely an integral.) This means that [by analogy to (3.5)] we have

$$\psi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikx}\,\hat\psi(k)\,dk, \tag{3.11}$$

where $\hat\psi(k)$ is the Fourier transform of $\psi$, defined by

$$\hat\psi(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ikx}\,\psi(x)\,dx. \tag{3.12}$$

(See Appendix A.3.2 for information about the Fourier transform.)
The Plancherel theorem (Theorem A.19) then tells us that the Fourier
transform is a unitary map of $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$. Thus, for any unit vector
$\psi \in L^2(\mathbb{R})$,

$$\int_{-\infty}^{\infty} |\psi(x)|^2\,dx = \int_{-\infty}^{\infty} |\hat\psi(k)|^2\,dk = 1.$$

In light of what we have in the circle case, it is natural to think that $|\hat\psi(k)|^2$
is essentially the probability density for the momentum of the particle.
(To be precise, $|\hat\psi(k)|^2$ is the probability density for $p/\hbar$.)

We can now express the properties of the momentum operator entirely 
within the Hilbert space $L^2(\mathbb{R})$, without making explicit mention of the
non-square-integrable functions $e^{ikx}$.

Proposition 3.6 Define the momentum operator $P$ by

$$P = -i\hbar \frac{d}{dx}.$$

Then for all sufficiently nice unit vectors $\psi$ in $L^2(\mathbb{R})$, we have

$$\langle \psi, P^m\psi \rangle = \int_{-\infty}^{\infty} (\hbar k)^m\,|\hat\psi(k)|^2\,dk \tag{3.13}$$

for all positive integers $m$. The quantity in (3.13) is interpreted as the
expectation value of the $m$th power of the momentum, $E(p^m)$.


Equation (3.13) should be compared to (3.10) in the case of the circle. 
Proof. If $\psi$ is in, say, the Schwartz space (Definition A.15), then, by
applying Proposition A.17 $m$ times, we see that the Fourier transform of the
$m$th derivative of $\psi$ is $(ik)^m\hat\psi(k)$, and so the Fourier transform of $P^m\psi$ is
$(\hbar k)^m\hat\psi(k)$. Meanwhile, since the Fourier transform is unitary, we have

$$\langle \psi, P^m\psi \rangle = \int_{-\infty}^{\infty} \overline{\hat\psi(k)}\,(\hbar k)^m\,\hat\psi(k)\,dk,$$
which gives (3.13). (The assumption that $\psi$ be in the Schwartz space is
stronger than necessary. The reader is invited to use integration by parts 


and the definition of the Fourier transform to find weaker assumptions that 
allow the same conclusion.) 
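Proposition 3.6 can be checked numerically in a simple case. The sketch below compares the two sides of (3.13) for $m = 2$, using a Gaussian wave function, units with $\hbar = 1$, midpoint-rule integration, and a finite-difference second derivative. All of these are illustrative assumptions rather than anything prescribed by the text.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption)

def psi(x):
    """Real, normalized Gaussian wave function (illustrative choice)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

def midpoints(lo, hi, n):
    """Midpoints of n equal subintervals of [lo, hi], plus the step size."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)], step

XS, DX = midpoints(-10.0, 10.0, 2000)

def psi_hat(k):
    """Fourier transform with the convention of (3.12), by the midpoint rule."""
    total = sum(cmath.exp(-1j * k * x) * psi(x) for x in XS) * DX
    return total / math.sqrt(2.0 * math.pi)

def momentum_side(m):
    """Right-hand side of (3.13): integral of (hbar k)^m |psi_hat(k)|^2 dk."""
    ks, dk = midpoints(-8.0, 8.0, 400)
    return sum((HBAR * k) ** m * abs(psi_hat(k)) ** 2 for k in ks) * dk

def position_side_m2(h=1e-3):
    """<psi, P^2 psi> = -hbar^2 * integral of psi * psi'' dx (psi is real here)."""
    total = 0.0
    for x in XS:
        second_derivative = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
        total += psi(x) * second_derivative * DX
    return -HBAR ** 2 * total

lhs = position_side_m2()
rhs = momentum_side(2)
```

For this Gaussian both sides are close to $\hbar^2/2$, in agreement with the proposition.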


3.5 The Position and Momentum Operators 


In the following definition, we summarize what we have learned, in the two 
previous sections, about the position and momentum operators.

Definition 3.7 For a particle moving in $\mathbb{R}^1$, let the quantum Hilbert space
be $L^2(\mathbb{R})$ and define the position and momentum operators $X$ and $P$
by

$$X\psi(x) = x\psi(x),$$

$$P\psi(x) = -i\hbar \frac{d\psi}{dx}.$$

Neither the position nor the momentum operator is defined as mapping
the entire Hilbert space $L^2(\mathbb{R})$ into itself. After all, for $\psi \in L^2(\mathbb{R})$, the
function $x\psi(x)$ may fail to be in $L^2(\mathbb{R})$. Similarly, a function $\psi$ in $L^2(\mathbb{R})$ may
fail to be differentiable, and even if it is differentiable, the derivative may fail
to be in $L^2(\mathbb{R})$. What this means is that $X$ and $P$ are unbounded operators,
of the sort discussed briefly in Sect. 3.2. They are defined on suitable dense
subspaces $\mathrm{Dom}(X)$ and $\mathrm{Dom}(P)$ of $L^2(\mathbb{R})$. We defer a detailed examination
of the domains of these operators until Chap. 9.

A vitally important property of this pair of operators is that they do not 
commute. 


Proposition 3.8 The position and momentum operators X and P do not 
commute, but satisfy the relation 


$$XP - PX = i\hbar I. \tag{3.14}$$

This relation is known as the canonical commutation relation.
Proof. Using the product rule we calculate that

$$PX\psi = -i\hbar \frac{d}{dx}\bigl(x\psi(x)\bigr)
= -i\hbar\,\psi(x) - i\hbar\,x\frac{d\psi}{dx}
= -i\hbar\,\psi(x) + XP\psi,$$

from which (3.14) follows. ∎
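The canonical commutation relation can be spot-checked numerically by applying $XP - PX$ to a smooth test function. In the sketch below, the test function, the central-difference derivative, and the choice $\hbar = 1$ are our own assumptions for illustration.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (assumption)

def psi(x):
    """A smooth complex test function (illustrative choice)."""
    return cmath.exp(-x * x / 2.0 + 0.7j * x)

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def X(f):
    """Position operator: multiplication by x."""
    return lambda x: x * f(x)

def P(f):
    """Momentum operator: -i hbar d/dx, with the derivative taken numerically."""
    return lambda x: -1j * HBAR * derivative(f, x)

def commutator_XP(f):
    """(XP - PX) applied to f."""
    return lambda x: X(P(f))(x) - P(X(f))(x)

sample_points = [-1.5, 0.0, 0.3, 2.0]
errors = [abs(commutator_XP(psi)(x) - 1j * HBAR * psi(x)) for x in sample_points]
```

Up to discretization error, each entry of `errors` is zero, reflecting $(XP - PX)\psi = i\hbar\psi$ at the sampled points.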

There are many important consequences of the relation (3.14), which we 
will examine at length in Chaps. 11-14 of the book. For now, we simply note
a parallel between (3.14) and the Poisson bracket relationship in classical 
mechanics: {x,p} = 1, as follows directly from the definition of the Poisson 
bracket. This hints at an analogy, which we will explore further in Sect. 3.7, 
between the commutator of two operators A and B on the quantum side 
(namely, the operator AB — BA) and the Poisson bracket of two functions 
f and g on the classical side. 


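Proposition 3.9 For all sufficiently nice functions $\phi$ and $\psi$ in $L^2(\mathbb{R})$,
we have

$$\langle \phi, X\psi \rangle = \langle X\phi, \psi \rangle$$
and

$$\langle \phi, P\psi \rangle = \langle P\phi, \psi \rangle.$$

Proof. Suppose that $\phi$ and $\psi$ belong to $L^2(\mathbb{R})$ and that the functions $x\phi(x)$
and $x\psi(x)$ also belong to $L^2(\mathbb{R})$. Then since $x$ is real, we have

$$\int_{-\infty}^{\infty} \overline{\phi(x)}\,x\psi(x)\,dx = \int_{-\infty}^{\infty} \overline{x\phi(x)}\,\psi(x)\,dx,$$

where both integrals are convergent because they are both integrals of the
product of two $L^2$ functions.

Meanwhile, for the second claim, let us assume that $\phi$ and $\psi$ are
continuously differentiable and that $\phi(x)$ and $\psi(x)$ tend to zero as $x$ tends to
$\pm\infty$. Let us also assume that $\phi$, $\psi$, $d\phi/dx$, and $d\psi/dx$ belong to $L^2(\mathbb{R})$. We
note that $\overline{d\phi/dx}$ is the same as $d\bar\phi/dx$. Thus, using integration by parts,
we obtain

$$-i\hbar \int_{-A}^{A} \overline{\phi(x)}\,\frac{d\psi}{dx}\,dx
= -i\hbar\,\overline{\phi(x)}\,\psi(x)\Big|_{-A}^{A}
+ i\hbar \int_{-A}^{A} \overline{\frac{d\phi}{dx}}\,\psi(x)\,dx.$$

Under our assumptions on $\phi$ and $\psi$, as $A$ tends to infinity, the
boundary terms will vanish and the remaining integrals will tend (by dominated
convergence) to integrals over the whole real line. Thus,

$$\int_{-\infty}^{\infty} \overline{\phi(x)} \left(-i\hbar \frac{d\psi}{dx}\right) dx
= i\hbar \int_{-\infty}^{\infty} \overline{\frac{d\phi}{dx}}\,\psi(x)\,dx
= \int_{-\infty}^{\infty} \overline{\left(-i\hbar \frac{d\phi}{dx}\right)}\,\psi(x)\,dx,$$

which is the second claim in the proposition. ∎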

In the language of Definition 3.3, Proposition 3.9 means that X and P 
are symmetric operators on certain dense subspaces of L?(R) (the space of 
functions for which the proposition is proved). It is actually true that X 
and P are essentially self-adjoint on these domains. The proof of essential 
self-adjointness, however, will have to wait until Chap. 9. 


3.6 Axioms of Quantum Mechanics: Operators 
and Measurements 


In this section we consider the general “axioms” of quantum mechanics. 
These axioms are not to be understood in the mathematical sense as rules 
from which all other results are derived in a strictly deductive fashion. 
Rather, the axioms are the main principles of how quantum mechanics 
works. Here we look at the “kinematic” axioms, those that apply at one 
fixed time. There is one additional axiom, governing the time-evolution of 
the system, which we consider in the next section.

Axiom 1 The state of the system is represented by a unit vector $\psi$ in an
appropriate Hilbert space $\mathbf{H}$. If $\psi_1$ and $\psi_2$ are two unit vectors in $\mathbf{H}$ with
$\psi_2 = c\psi_1$ for some constant $c \in \mathbb{C}$, then $\psi_1$ and $\psi_2$ represent the same
physical state.


The Hilbert space H is frequently called the “quantum Hilbert space.” 
This does not, however, mean that H is some variant of the notion of a 
Hilbert space, the way a quantum group is a variant of the notion of a 
group. Rather, “quantum Hilbert space” means simply, “the Hilbert space 
associated with a given quantum system.” 

In Axiom 1, it should be noted that unit vectors in H actually represent 
only the “pure states” of the theory. There is a more general notion of a 
“mixed state” (described by a “density matrix”) that we will consider in 
Chap. 19. We will follow the custom in most physics texts of considering at 
first only pure states. 


Axiom 2 To each real-valued function $f$ on the classical phase space there
is associated a self-adjoint operator $\hat f$ on the quantum Hilbert space.

In almost all cases, the operator $\hat f$ is unbounded. This unboundedness
is unsurprising when we realize that physically relevant functions $f$ on
the classical phase space (e.g., position and momentum) are unbounded
functions. In the unbounded case, the notion of self-adjointness is rather
technical; see Definition 3.3 in Sect. 3.2. In most applications, it is not
really necessary to define $\hat f$ for all functions on the classical phase space,
but only for certain basic functions, such as position, momentum, energy,
and angular momentum. We will describe the quantizations of these basic
functions in this chapter. If one really needs to define $\hat f$ for an arbitrary
function $f$ (satisfying some regularity assumptions), the standard approach
is to use the Weyl quantization scheme, described in Chap. 13.

For a particle moving in $\mathbb{R}^1$, the classical phase space is $\mathbb{R}^2$, which we
think of as pairs $(x,p)$ with $x$ being the particle’s position and $p$ being
its momentum. The quantum Hilbert space in this case is usually taken
to be $L^2(\mathbb{R})$ [not $L^2(\mathbb{R}^2)$]. In that case, if the function $f$ in Axiom 2 is
the position function, $f(x,p) = x$, then the associated operator $\hat f$ is the
position operator $X$, given by multiplication by $x$. If $f$ is the momentum
function, $f(x,p) = p$, then $\hat f$ is the momentum operator $P = -i\hbar\, d/dx$.

In the physics literature, a function $f$ on the classical phase space is called
a classical observable, meaning that it is some physical quantity that could
be observed by taking a measurement of the system. The corresponding
operator $\hat f$ is then called a quantum observable.


Axiom 3 If a quantum system is in a state described by a unit vector
$\psi \in \mathbf{H}$, the probability distribution for the measurement of some observable
$f$ satisfies

$$E(f^m) = \langle \psi, (\hat f)^m \psi \rangle. \tag{3.15}$$

In particular, the expectation value for a measurement of $f$ is given by
$$\langle \psi, \hat f \psi \rangle. \tag{3.16}$$

Note that we have adopted the point of view that even in a quantum
mechanical system, what one is measuring is the classical observable $f$.
In the quantum case, however, $f$ no longer has a definite value, but only
probabilities, which are encoded by the quantum observable $\hat f$ and the
vector $\psi \in \mathbf{H}$.

If $\psi$ is a nonzero vector in $\mathbf{H}$ but not a unit vector, then (3.16) should
be replaced by

$$\langle \tilde\psi, \hat f \tilde\psi \rangle = \frac{\langle \psi, \hat f \psi \rangle}{\langle \psi, \psi \rangle},$$

where $\tilde\psi := \psi/\|\psi\|$ is the unit vector associated with $\psi$. It is convenient to
assume that our vectors have been normalized to be unit vectors, simply
to avoid having to divide by $\langle \psi, \psi \rangle$ in our expectation values.

Since $\hat f$ is assumed to be self-adjoint and every self-adjoint operator is
symmetric, Proposition 3.4 tells us that the moments $E(f^m)$, and in
particular the expectation value $E(f)$, are real numbers. Since $\hat f$ is assumed to be
self-adjoint and not just symmetric, the spectral theorem (Chaps. 7 and 10)
will give a canonical way of constructing a probability measure $\mu_{A,\psi}$ on $\mathbb{R}$,
where $A = \hat f$, that may be interpreted as the probability distribution for
measurements of $A$ in the state $\psi$.

Axiom 3 provides motivation for the idea that two unit vectors that differ
by a constant represent the same physical state. If $\psi_2 = c\psi_1$ with $|c| = 1$,
then for any operator $A$, we have

$$\langle \psi_2, A\psi_2 \rangle = \langle c\psi_1, Ac\psi_1 \rangle = |c|^2 \langle \psi_1, A\psi_1 \rangle = \langle \psi_1, A\psi_1 \rangle.$$

Thus, the expectation values of all observables are the same in the state
$\psi_2$ as in the state $\psi_1$.


Notation 3.10 If $A$ is a self-adjoint operator on $\mathbf{H}$ and $\psi \in \mathbf{H}$ is a unit
vector, the expectation value of $A$ in the state $\psi$ is denoted $\langle A \rangle_\psi$ and is
defined (in light of Axiom 3) to be

$$\langle A \rangle_\psi = \langle \psi, A\psi \rangle. \tag{3.17}$$

Proposition 3.11 (Eigenvectors) If a quantum system is in a state
described by a unit vector $\psi \in \mathbf{H}$ and for some quantum observable $\hat f$ we
have $\hat f \psi = \lambda\psi$ for some $\lambda \in \mathbb{R}$, then

$$E(f^m) = \langle (\hat f)^m \rangle_\psi = \lambda^m \tag{3.18}$$

for all positive integers $m$. The unique probability measure consistent with
this condition is the one in which $f$ has the definite value $\lambda$, with
probability one.

What the proposition means is that if $\psi$ is an eigenvector for $\hat f$, then
measurements of $f$ for a particle in the state $\psi$ are not actually random,
but rather always give the value $\lambda$. If $\hat f\psi = \lambda\psi$, then
$\langle \psi, (\hat f)^m\psi \rangle = \lambda^m \langle \psi, \psi \rangle = \lambda^m$. Thus, by (3.15), we want to find a
probability measure $\mu$ on $\mathbb{R}$ such that

$$\int_{\mathbb{R}} x^m\,d\mu(x) = \lambda^m \tag{3.19}$$

for all non-negative integers $m$. The proposition is claiming that there is
one and only one such measure, namely the $\delta$-measure at the point $\lambda$.
Because $\hat f$ is assumed to be self-adjoint and therefore symmetric,
Proposition 3.4 tells us that every eigenvalue for $\hat f$ is real.
Proof. The relation (3.18) follows from (3.15) and the fact that $\hat f\psi = \lambda\psi$.
Meanwhile, if $\mu$ is the $\delta$-measure at $\lambda$, then certainly (3.19) holds.
Finally, since the $m$th moment grows only exponentially with $m$, even
the most elementary uniqueness results for the moment problem show that
the $\delta$-measure is the only measure with these moments. (See, e.g., Theorem
8.1 in Chap. 4 of [18].) ∎
If, more generally, the state of the system is a linear combination of
eigenvectors for $\hat f$, measurements of $f$ will no longer be deterministic.


Example 3.12 Suppose $\hat f$ has an orthonormal basis $\{e_j\}$ of eigenvectors
with distinct (real) eigenvalues $\lambda_j$. Suppose also that $\psi$ is a unit vector in
$\mathbf{H}$ with the expansion

$$\psi = \sum_{j=1}^{\infty} a_j e_j. \tag{3.20}$$

Then for a measurement in the state $\psi$ of the observable $f$, the observed
value of $f$ will always be one of the numbers $\lambda_j$. Furthermore, the probability
of observing the value $\lambda_j$ is given by

$$\mathrm{Prob}\{f = \lambda_j\} = |a_j|^2. \tag{3.21}$$

Assuming that $\psi$ is in the domain of $(\hat f)^m$, it is easy to verify that the
probabilities in (3.21) are consistent with the expectation values given in
Axiom 3. After all, if $\psi$ is given as in (3.20), then we can readily calculate
that $\langle \psi, (\hat f)^m\psi \rangle$ equals $\sum_j |a_j|^2 \lambda_j^m$, which is nothing but the $m$th moment
associated with the probability distribution in (3.21). In general, we
cannot quite derive (3.21) from Axiom 3, since the uniqueness results for the
moment problem might not apply. Nevertheless, (3.21) is the most natural
candidate for the probabilities, and we will assume that this formula holds.
It is not difficult to extend Example 3.12 to the case where the eigenvalues
are not distinct: For any sequence $\{\lambda_j\}$ of eigenvalues, the probability of
observing some value $\lambda$ will be the sum of $|a_j|^2$ over all those values of $j$
for which $\lambda_j = \lambda$. For any self-adjoint operator $A$, the spectral theorem
implies that $A$ has either an orthonormal basis of eigenvectors or some
continuous analog thereof. In particular, given a self-adjoint operator $A$
and a unit vector $\psi \in \mathbf{H}$, the spectral theorem will give us a probability
measure $\mu_{A,\psi}$ on $\mathbb{R}$ that we interpret as describing the probabilities for a
measurement of $A$ in the state $\psi$. See Proposition 7.17 in the bounded case
and Definition 10.7 in the unbounded case.
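The consistency between (3.21) and Axiom 3 can be seen concretely in a finite-dimensional toy model. The sketch below uses a $2 \times 2$ real symmetric matrix in place of $\hat f$ and a particular unit vector; the matrix, the vector, and the helper names are hypothetical choices made for illustration.

```python
import math

# Toy observable: a 2x2 real symmetric matrix (illustrative choice).
A = [[0.0, 1.0], [1.0, 0.0]]
eigenvalues = [1.0, -1.0]
s = 1.0 / math.sqrt(2.0)
eigenvectors = [[s, s], [s, -s]]  # orthonormal eigenvectors e_1, e_2 of A

psi = [0.6, 0.8]  # a unit vector

def inner(u, v):
    """Inner product with the conjugate on the first factor."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

coefficients = [inner(e, psi) for e in eigenvectors]   # a_j = <e_j, psi>
probabilities = [abs(a) ** 2 for a in coefficients]    # as in (3.21)

def moment_from_probabilities(m):
    """Sum over j of |a_j|^2 * lambda_j^m."""
    return sum(p * lam ** m for p, lam in zip(probabilities, eigenvalues))

def moment_from_operator(m):
    """<psi, A^m psi>, as in (3.15)."""
    v = psi
    for _ in range(m):
        v = apply(A, v)
    return inner(psi, v).real
```

Here the two outcomes $+1$ and $-1$ occur with probabilities $0.98$ and $0.02$, and the two ways of computing each moment agree.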


Axiom 4 Suppose a quantum system is initially in a state $\psi$ and that a
measurement of an observable $f$ is performed. If the result of the
measurement is the number $\lambda \in \mathbb{R}$, then immediately after the measurement, the
system will be in a state $\psi'$ that satisfies

$$\hat f \psi' = \lambda \psi'.$$

The passage from $\psi$ to $\psi'$ is called the collapse of the wave function. Here
$\hat f$ is the self-adjoint operator associated with $f$ by Axiom 2.

Let us assume again that $\hat f$ has an orthonormal basis of eigenvectors $\{e_j\}$
with distinct eigenvalues $\lambda_j$. Then we can say, more specifically, that if we
observe the value $\lambda_j$ in a measurement of $f$ (and we will always observe
one of the $\lambda_j$’s) then $\psi' = e_j$. That is, the measurement “collapses” the
wave function by throwing away all the components of $\psi$ in the direction
of the $e_k$’s, except the one with $k = j$.

This idea of the collapse of the wave function has generated an enormous 
amount of discussion and controversy. One way to look at the situation is 
to think that the wave function $\psi$ is not actually the state of the system
(although we continue to use the standard physics term, “state”). Rather,
the wave function is the thing that encodes the probabilities for the state of 
the system. The collapse of the wave function is then something similar to 
a conditional probability; the probabilities for future measurements of the 
system should be consistent with the outcome of the measurement we just 
made. Paul Dirac has described the collapse of the wave function as being 
not a discontinuous change in the state of the system, but a discontinuous 
change in our knowledge of the state of the system. 

In any case, Axiom 4 guarantees the following reasonable principle: If
we measure $f$ and then measure $f$ again a very short time later, the result
of the second measurement will agree with the result of the first
measurement. Thus, immediately after the first measurement, the probabilities for
a second measurement of $f$ are not those associated with the vector $\psi$, but
rather those associated with the state $\psi'$. (Since $\psi'$ is an eigenvector for $\hat f$
with eigenvalue $\lambda$, Proposition 3.11 tells us that measurements of $f$ in the
state $\psi'$ always give the value $\lambda$.)

Note that Axiom 4 only tells us something about the state of the system
immediately after a measurement. Following the measurement, the state of
the system will evolve in time in the usual way (Sect. 3.7). A significant
time after the measurement, then, the system will probably no longer be
in the state $\psi'$.

Let us conclude this section by considering an example of how one makes
a measurement of a real-world physical system, namely, the hydrogen atom.
The Hamiltonian operator $\hat H$ for a hydrogen atom has negative eigenvalues
of the form

$$-\frac{R}{n^2}, \tag{3.22}$$
where $R$ is the Rydberg constant and $n = 1, 2, 3, \ldots$ These energies will be
derived in Chap. 18. Negative eigenvalues are of greater interest than
positive ones, because negative eigenvalues describe states where the electron
is bound to the nucleus. If an electron is placed into a state having energy
$-R/n_1^2$, with $n_1 > 1$, it will eventually “decay” into a state with lower
energy, say, $-R/n_2^2$, with $n_2 < n_1$. (The most readily observed cases are
those with $n_2 = 2$ and $n_2 = 1$.) In the process of decaying, the electron
emits a photon, with the energy of the photon being equal to the change
in energy of the electron, namely,

$$E_{\mathrm{photon}} = \frac{R}{n_2^2} - \frac{R}{n_1^2}. \tag{3.23}$$
Meanwhile, the frequency of the photon is proportional to its energy. Thus,
by observing the frequency of the emitted photon, one can determine the
change in energy of the electron and thus determine the values of $n_1$ and $n_2$.
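To get a feel for the sizes involved in (3.22) and (3.23), the following sketch evaluates the photon energy for a given transition. The numerical value of $R$ (about 13.6 electron volts) is a standard figure that we supply as an assumption, since the text does not state it; the function names are likewise our own.

```python
RYDBERG_EV = 13.605693  # approximate Rydberg energy in eV (not given in the text)

def level_energy(n):
    """Energy -R/n^2 of the n-th bound level, as in (3.22)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -RYDBERG_EV / n ** 2

def photon_energy(n1, n2):
    """Photon energy R/n2^2 - R/n1^2 for a decay from level n1 to level n2 < n1, as in (3.23)."""
    if not n2 < n1:
        raise ValueError("a decay requires n2 < n1")
    return level_energy(n1) - level_energy(n2)

balmer_alpha = photon_energy(3, 2)  # roughly 1.89 eV
lyman_alpha = photon_energy(2, 1)   # roughly 10.2 eV
```

Since the photon energy is the initial energy minus the final energy, it is positive whenever $n_2 < n_1$.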

A general “bound state” of the hydrogen atom (a state in which the
electron is bound to the nucleus) will be a linear combination of eigenvectors
for $\hat H$ with various different eigenvalues of the form (3.22). To measure
the energy of the electron, we simply wait for the electron to decay into a 
lower-energy state and emit a photon, observe the frequency of the photon, 
and work backwards to the energy of the electron. If we consider many 
“identically prepared” electrons, all having the same wave function that 
is a linear combination of eigenvectors, we will observe many different fre- 
quencies for the emitted photons, and thus many different energies for the 
electron. The probabilities for the observed energies of the electron will 
follow the principle spelled out in Example 3.12. 

In basic probability theory, if $Y$ is a random variable then the variance
$\sigma^2$ of $Y$ is computed as

$$\sigma^2 = E\left[(Y - E(Y))^2\right],$$

where $E$ denotes the mean or expectation value of a random variable. The
standard deviation $\sigma := \sqrt{\sigma^2}$ is a measure of the “typical” deviation from
the mean $E(Y)$. Observe that the variance may be computed as

$$\begin{aligned}
\sigma^2 &= E\left[Y^2 - 2E(Y)Y + E(Y)^2\right] \\
&= E(Y^2) - 2E(Y)^2 + E(Y)^2 \\
&= E(Y^2) - E(Y)^2.
\end{aligned} \tag{3.24}$$

Definition 3.13 If $A$ is a self-adjoint operator on a Hilbert space $\mathbf{H}$ and
$\psi$ is a unit vector in $\mathbf{H}$, let $\Delta_\psi A$ denote the standard deviation associated
with measurements of $A$ in the state $\psi$, which is computed as

$$(\Delta_\psi A)^2 = \left\langle (A - \langle A \rangle_\psi I)^2 \right\rangle_\psi
= \langle A^2 \rangle_\psi - (\langle A \rangle_\psi)^2.$$
We refer to $\Delta_\psi A$ as the uncertainty of $A$ in the state $\psi$.

For any single observable $A$, it is possible to choose $\psi$ so that $\Delta_\psi A$
is as small as we like. In Chap. 12, however, we will see that when two
observables $A$ and $B$ do not commute, then $\Delta_\psi A$ and $\Delta_\psi B$ cannot both
be made arbitrarily small for the same $\psi$. In particular, we will derive there
the famous Heisenberg uncertainty principle, which states that

$$(\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2},$$

for all $\psi$ for which $\Delta_\psi X$ and $\Delta_\psi P$ are defined.
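The two expressions for $(\Delta_\psi A)^2$ in Definition 3.13 can be compared in a finite-dimensional toy model. The sketch below uses a $2 \times 2$ real symmetric matrix as a stand-in for $A$; the matrix, the test vectors, and the helper names are assumptions made for illustration only.

```python
import math

A = [[0.0, 1.0], [1.0, 0.0]]  # toy observable (illustrative choice)

def inner(u, v):
    """Inner product with the conjugate on the first factor."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def expectation(M, psi):
    """<M>_psi = <psi, M psi> for a unit vector psi."""
    return inner(psi, apply(M, psi)).real

def uncertainty(M, psi):
    """Square root of <M^2>_psi - (<M>_psi)^2, clipped at zero against rounding."""
    mean = expectation(M, psi)
    second = inner(psi, apply(M, apply(M, psi))).real
    return math.sqrt(max(second - mean ** 2, 0.0))

def uncertainty_centered(M, psi):
    """Norm of (M - <M>_psi I) psi, which equals the uncertainty when M is symmetric."""
    mean = expectation(M, psi)
    n = len(M)
    shifted = [[M[i][j] - (mean if i == j else 0.0) for j in range(n)] for i in range(n)]
    w = apply(shifted, psi)
    return math.sqrt(inner(w, w).real)

s = 1.0 / math.sqrt(2.0)
generic = uncertainty(A, [0.6, 0.8])   # a state that is not an eigenvector
eigen = uncertainty(A, [s, s])         # an eigenvector of A
```

The uncertainty vanishes (up to rounding) on the eigenvector, in line with Proposition 3.11, and is strictly positive on the generic state.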


3.7 Time-Evolution in Quantum Theory 


3.7.1 The Schrödinger Equation


Up to now, we have been considering the wave function $\psi$ at a fixed time.
We now consider the way in which the wave function evolves in time. Recall
that in the Hamiltonian formulation of classical mechanics (Sect. 2.5), the
time-evolution of the system is governed by the Hamiltonian (energy)
function $H$, through Hamilton’s equations. According to Axiom 2, there is a
corresponding self-adjoint linear operator $\hat H$ on the quantum Hilbert space
$\mathbf{H}$, which we call the Hamiltonian operator for the system. See Sect. 3.7.4
for an example.

Recall that we motivated the definition of the momentum operator by
the de Broglie hypothesis, $p = \hbar k$, where $k$ is the spatial frequency of the
wave function. We can similarly motivate the time-evolution in quantum
mechanics by a relation between the energy and the temporal frequency
of our wave function:

$$E = \hbar\omega. \tag{3.25}$$

This relationship between energy and temporal frequency is nothing but the
relationship proposed by Planck in his model of blackbody radiation (Sect.
1.1.3). Suppose that a wave function $\psi_0$ has definite energy $E$, meaning
that $\psi_0$ is an eigenvector for $\hat H$ with eigenvalue $E$. Then (3.25) means that
the time-dependence of the wave function should be purely at frequency
$\omega = E/\hbar$. That is to say, if the state of the system at time $t = 0$ is $\psi_0$, then
the state of the system at any other time $t$ should be

$$\psi(t) = e^{-i\omega t}\psi_0 = e^{-itE/\hbar}\psi_0. \tag{3.26}$$
We can rewrite (3.26) as a differential equation:

$$\frac{d\psi}{dt} = -\frac{iE}{\hbar}\psi = \frac{E}{i\hbar}\psi. \tag{3.27}$$

Note that we are taking “temporal frequency $\omega$” to mean that the
time-dependence is of the form $e^{-i\omega t}$, whereas we took “spatial frequency $k$” to
mean that the space-dependence is of the form $e^{ikx}$, with no minus sign in
the exponent. This curious convention is convenient when we look at pure
exponential solutions to the free Schrödinger equation (Chap. 4) of the form
$\exp[i(kx - \omega t)]$, which describes a solution moving to the right with speed
$\omega/k$.

Equation (3.27) tells us the time-evolution for a particle that is initially
in a state of definite energy, that is, an eigenvector for the Hamiltonian
operator. A natural way to generalize this equation is to recognize that $E\psi$
is nothing but $\hat H\psi$, since $\psi$ is just a multiple of $\psi_0$, which is an eigenvector
for $\hat H$ with eigenvalue $E$. Replacing $E$ by $\hat H$ in (3.27) leads to the following
general prescription for the time-evolution of a quantum system.


Axiom 5 The time-evolution of the wave function $\psi$ in a quantum system
is given by the Schrödinger equation,

$$\frac{d\psi}{dt} = \frac{1}{i\hbar}\hat H\psi. \tag{3.28}$$
Here $\hat H$ is the operator corresponding to the classical Hamiltonian $H$ by
means of Axiom 2.


Although both Hamilton’s equations and the Schrödinger equation
involve a Hamiltonian, the two equations otherwise do not seem parallel. 
Of course, since quantum mechanics is not classical mechanics, we should 
not expect the two theories to have the same time-evolution. Neverthe- 
less, we might hope to see some similarities between the time-evolution of 
a classical system and that of the corresponding quantum system. Such 
a similarity can be seen when we consider how the expectation values of 
observables evolve in quantum mechanics. 


Proposition 3.14 Suppose $\psi(t)$ is a solution of the Schrödinger equation
and $A$ is a self-adjoint operator on $\mathbf{H}$. Assuming certain natural domain
conditions hold, we have

$$\frac{d}{dt}\langle A \rangle_{\psi(t)} = \left\langle \frac{1}{i\hbar}[A, \hat H] \right\rangle_{\psi(t)}, \tag{3.29}$$
where $\langle A \rangle_\psi$ is as in Notation 3.10 and where $[\cdot,\cdot]$ denotes the commutator,
defined as
$$[A, B] = AB - BA.$$


Equation (3.29) should be compared to the way a function f on the clas- 
sical phase space evolves in time along a solution of Hamilton’s equations: 
df /dt = {f, H}. We see, then, that the commutator of operators (divided 
by ih) plays a role in quantum mechanics similar to the role of the Poisson 
bracket in classical mechanics. 

Proof. Let $\psi(t)$ be a solution to the Schrödinger equation and let us
compute at first without worrying about domains of the operators involved. If
we use the product rule (Exercise 1) for differentiation of the inner product,
we obtain

$$\begin{aligned}
\frac{d}{dt}\langle \psi(t), A\psi(t) \rangle
&= \left\langle \frac{d\psi}{dt}, A\psi \right\rangle + \left\langle \psi, A\frac{d\psi}{dt} \right\rangle \\
&= \frac{i}{\hbar}\langle \hat H\psi, A\psi \rangle - \frac{i}{\hbar}\langle \psi, A\hat H\psi \rangle \\
&= \frac{1}{i\hbar}\langle \psi, [A, \hat H]\psi \rangle,
\end{aligned}$$

where in the last step we have used the self-adjointness of $\hat H$ to move it
to the other side of the inner product. Recall that we are following the
convention of putting the complex conjugate on the first factor in the inner
product, which accounts for the plus sign in the first term on the second
line. Rewriting this using Notation 3.10 gives the desired result.

If $A$ and $\hat H$ are (as usual) unbounded operators, then the preceding
calculation is not completely rigorous. Since, however, we are deferring a
detailed examination of issues of unbounded operators until Chap. 9, let
us simply state the conditions needed for the calculation to be valid. For
every $t \in \mathbb{R}$, we need to have $\psi(t) \in \mathrm{Dom}(A) \cap \mathrm{Dom}(\hat H)$, we need
$A\psi(t) \in \mathrm{Dom}(\hat H)$, and we need $\hat H\psi(t) \in \mathrm{Dom}(A)$. (These conditions are needed for
$[A, \hat H]\psi(t)$ to be defined.) In addition, we need $A\psi(t)$ to be a continuous
path in $\mathbf{H}$. ∎
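Equation (3.29) can be tested in a two-dimensional model where everything is explicit. In the sketch below, the diagonal Hamiltonian, the observable, the initial state, and the units with $\hbar = 1$ are all hypothetical choices; the left-hand side of (3.29) is approximated by a central difference in $t$.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption)

H = [[1.0, 0.0], [0.0, -1.0]]  # toy Hamiltonian, already diagonal
A = [[0.0, 1.0], [1.0, 0.0]]   # toy observable that does not commute with H
PSI0 = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

def evolve(t):
    """psi(t): each eigencomponent of psi_0 acquires the phase exp(-i t E / hbar), as in (3.26)."""
    return [cmath.exp(-1j * t * H[j][j] / HBAR) * PSI0[j] for j in range(2)]

def inner(u, v):
    """Inner product with the conjugate on the first factor."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def commutator(M, N):
    MN, NM = matmul(M, N), matmul(N, M)
    return [[MN[i][j] - NM[i][j] for j in range(2)] for i in range(2)]

def expectation(M, v):
    return inner(v, apply(M, v))

def lhs(t, h=1e-6):
    """d/dt <A>_{psi(t)}, by a central difference."""
    plus = expectation(A, evolve(t + h)).real
    minus = expectation(A, evolve(t - h)).real
    return (plus - minus) / (2.0 * h)

def rhs(t):
    """<(1/(i hbar)) [A, H]>_{psi(t)}."""
    return (expectation(commutator(A, H), evolve(t)) / (1j * HBAR)).real
```

For these choices $\langle A \rangle_{\psi(t)} = \cos(2t)$, so both `lhs(t)` and `rhs(t)` should be close to $-2\sin(2t)$, and the norm of `evolve(t)` stays equal to 1.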

Note that to see interesting behavior in the time-evolution of a quantum
system, there has to be noncommutativity present. If all the physically
interesting operators $A$ commuted with the Hamiltonian operator $\hat H$, then
$[\hat H, A]$ would be zero and the expectation values of these operators would
be constant in time. Noncommutativity of the basic operators is therefore
an essential property of quantum mechanics. In the case of a particle in
$\mathbb{R}^1$, noncommutativity is built into the commutation relation for $X$ and $P$,
given in Proposition 3.8.

Although it is not reasonable to have all physically interesting operators
commute with $\hat H$, there may be some operators with this property. If
$[A, \hat H] = 0$, then the expectation value of $A$ (and, indeed, all the moments
of $A$) is independent of time along any solution of the Schrödinger equation.
We may therefore call such an operator $A$ a conserved quantity (or constant
of motion). Just as in the classical setting, conserved quantities (when we
can find them) are helpful in understanding how to solve the Schrödinger
equation.

Proposition 3.14 suggests that the map 


$$(A, B) \mapsto \frac{1}{i\hbar}[A, B],$$


where A and B are self-adjoint operators, plays a role similar to that of the 
Poisson bracket in classical mechanics. This analogy is supported by the 
following list of elementary properties of the commutator, which should be 
compared to the properties of the Poisson bracket listed in Proposition 2.23. 


Proposition 3.15 For any vector space $V$ over $\mathbb{C}$ and linear operators $A$,
$B$, and $C$ on $V$, the following relations hold.

1. $[A, B + \alpha C] = [A, B] + \alpha[A, C]$ for all $\alpha \in \mathbb{C}$
2. $[B, A] = -[A, B]$
3. $[A, BC] = [A, B]C + B[A, C]$
4. $[A, [B, C]] = [[A, B], C] + [B, [A, C]]$

Property 4 is equivalent to the Jacobi identity,
$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0, \tag{3.30}$$

as can easily be seen using the skew-symmetry of the commutator.
Proof. The first two properties of the commutator are obvious, and the
third is easily verified by writing things out. Property 4 can also be proved
by writing things out, but it is slightly messier. Each of the three double
commutators on the left-hand side of (3.30) generates four terms, for a total
of 12 terms. Each term has the operators $A$, $B$, and $C$ multiplied together
in some order. It is a straightforward but unenlightening calculation to
verify that each of the six possible orderings of $A$, $B$, and $C$ occurs twice,
with opposite signs. ∎
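The identities of Proposition 3.15 hold for arbitrary linear operators, so they can be spot-checked on any matrices. The sketch below does so with exact integer arithmetic on three arbitrarily chosen $2 \times 2$ matrices and one arbitrarily chosen scalar; a check of this kind illustrates the identities but of course does not prove them.

```python
def matmul(M, N):
    """Product of two square matrices given as lists of rows."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(M, N):
    return [[a + b for a, b in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]

def scale(c, M):
    return [[c * a for a in row] for row in M]

def comm(M, N):
    """Commutator [M, N] = MN - NM."""
    return add(matmul(M, N), scale(-1, matmul(N, M)))

# Arbitrary sample data (illustrative choices).
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 2]]
C = [[2, 0], [1, -3]]
ALPHA = 2 - 3j

linearity = comm(A, add(B, scale(ALPHA, C))) == add(comm(A, B), scale(ALPHA, comm(A, C)))
skew_symmetry = comm(B, A) == scale(-1, comm(A, B))
product_rule = comm(A, matmul(B, C)) == add(matmul(comm(A, B), C), matmul(B, comm(A, C)))
property_4 = comm(A, comm(B, C)) == add(comm(comm(A, B), C), comm(B, comm(A, C)))
jacobi_sum = add(add(comm(A, comm(B, C)), comm(B, comm(C, A))), comm(C, comm(A, B)))
```

With these inputs all four Boolean checks come out true and `jacobi_sum` is the zero matrix, as (3.30) requires.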

If $A$ and $B$ are bounded self-adjoint operators on some Hilbert space,
then it is straightforward to check that $(1/(i\hbar))[A, B]$ is again self-adjoint
(Exercise 3). If $A$ and $B$ are unbounded self-adjoint operators, then the
operator $(1/(i\hbar))[A, B]$ will be self-adjoint under suitable assumptions on
the domains of $A$ and $B$.


Proposition 3.16 If $\phi(t)$ and $\psi(t)$ are solutions to the Schrödinger equa- 
tion (3.28), the quantity $\langle \phi(t), \psi(t) \rangle$ is independent of t. In particular, 
$\|\psi(t)\|$ is independent of t, for any solution $\psi(t)$ of the Schrödinger equation. 

Proof. Using again the product rule, we have 

\[ \frac{d}{dt}\langle \phi(t), \psi(t) \rangle
   = \left\langle \frac{1}{i\hbar}\hat{H}\phi(t), \psi(t) \right\rangle
   + \left\langle \phi(t), \frac{1}{i\hbar}\hat{H}\psi(t) \right\rangle
   = -\frac{1}{i\hbar}\langle \hat{H}\phi(t), \psi(t) \rangle
   + \frac{1}{i\hbar}\langle \phi(t), \hat{H}\psi(t) \rangle. \]

Since $\hat{H}$ is self-adjoint, we can move $\hat{H}$ to the other side of the inner product 
and the derivative is equal to 0. ∎ 


3.7.2 Solving the Schrödinger Equation by Exponentiation 


The Schrödinger equation is an example of an equation of the form 

\[ \frac{dv}{dt} = Av, \tag{3.31} \]

where A is a linear operator on a Hilbert space. (In the Schrödinger case, 
we have $A = -(i/\hbar)\hat{H}$.) Let us think of (3.31) in the case where the Hilbert 
space is the finite-dimensional space $\mathbb{C}^n$. In that case, we can think of A as 
an $n \times n$ matrix, in which case (3.31) is the sort of equation encountered 
in the elementary theory of ordinary differential equations. The solution of 
this system (in the finite-dimensional case) can be expressed as 

\[ v(t) = e^{tA} v_0, \]

where the matrix exponential $e^{tA}$ is defined by a convergent power series 
and where $v_0 = v(0)$ is the initial condition. If A is diagonalizable, then 
the exponential can be computed by using a basis of eigenvectors. (See 
Sect. 16.4 for more information.) 
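To make the finite-dimensional case concrete, here is a minimal sketch that sums the power series for the matrix exponential and applies it to an initial vector. The function names (`expm_series`, `apply`), the truncation at 40 terms, and the choice of the rotation generator as A are assumptions made for illustration; a production computation would normally use a library routine instead of a truncated series.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(A, t, terms=40):
    """Approximate e^{tA} by a partial sum of sum_k (tA)^k / k!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds (tA)^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[t * x / k for x in row] for row in matmul(term, A)]
        result = [[r + s for r, s in zip(rr, ss)]
                  for rr, ss in zip(result, term)]
    return result

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Illustrative choice: A generates rotations, so A^2 = -I and
# e^{tA} = cos(t) I + sin(t) A.
A = [[0.0, 1.0], [-1.0, 0.0]]
v0 = [1.0, 0.0]
t = 1.0

v_t = apply(expm_series(A, t), v0)   # solves dv/dt = Av with v(0) = v0
print(v_t, (math.cos(t), -math.sin(t)))
```

For this A the series can be summed in closed form, which gives an independent check on the numerical result.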

The Schrödinger equation simply replaces $\mathbb{C}^n$ by a Hilbert space $\mathbf{H}$ and 
the matrix A by the linear operator $-(i/\hbar)\hat{H}$. 


Claim 3.17 Suppose $\hat{H}$ is a self-adjoint operator on $\mathbf{H}$. If a reasonable 
meaning can be given to the expression $e^{-it\hat{H}/\hbar}$, then the Schrödinger equa- 
tion can be solved by setting 

\[ \psi(t) = e^{-it\hat{H}/\hbar}\psi_0. \tag{3.32} \]

To see why the claim should be true, we expect that we can differentiate 
the operator-valued expression $e^{-it\hat{H}/\hbar}$ with respect to t as we would in the 
finite-dimensional case. The differentiation, then, would pull down a factor 
of $-i\hat{H}/\hbar$, which would indicate that $\psi(t)$ indeed solves the Schrödinger 
equation. Furthermore, when t = 0, $e^{-it\hat{H}/\hbar}$ should be equal to I, so that 
$\psi(0)$ is indeed $\psi_0$. 

If $\hat{H}$ is a bounded operator (which is rarely the case), then the expo- 
nential $e^{-it\hat{H}/\hbar}$ can be defined by a convergent power series, precisely as 
in the finite-dimensional case. In that case, Claim 3.17 is an easily proved 
theorem. 

In the more typical case where $\hat{H}$ is unbounded, convergence of the series 
for the exponential is a rather delicate matter, and it is better instead to 
use the spectral theorem. We leave a general discussion of the spectral 
theorem to Chaps. 7 and 10, and here consider only the case of a pure 
point spectrum. A (possibly unbounded) self-adjoint operator $\hat{H}$ is said to 
have a pure point spectrum if there exists an orthonormal basis $\{e_j\}$ for $\mathbf{H}$ 
consisting of eigenvectors for $\hat{H}$. If $\hat{H}e_j = E_j e_j$ for some $E_j \in \mathbb{R}$, then the 
exponential can be defined by requiring that 

\[ e^{-it\hat{H}/\hbar} e_j = e^{-itE_j/\hbar} e_j. \tag{3.33} \]

The operator $e^{-it\hat{H}/\hbar}$ is unitary and thus bounded; it is the unique bounded 
operator on $\mathbf{H}$ satisfying (3.33). 
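In the pure point spectrum case, (3.33) says that time-evolution simply multiplies the jth expansion coefficient of the state by the phase $e^{-itE_j/\hbar}$. The sketch below illustrates this on a truncated expansion. The names `evolve` and `norm`, the units with $\hbar = 1$, and the sample energies and coefficients are all assumptions chosen for illustration.

```python
import cmath
import math

HBAR = 1.0  # assumed units in which hbar = 1

def evolve(coeffs, energies, t, hbar=HBAR):
    """Apply e^{-itH/hbar} to a state given by its coefficients in an
    orthonormal eigenbasis: each coefficient picks up a phase, as in (3.33)."""
    return [cmath.exp(-1j * t * E / hbar) * c
            for c, E in zip(coeffs, energies)]

def norm(coeffs):
    """Hilbert-space norm of a state expanded in an orthonormal basis."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))

# Hypothetical three-level example.
energies = [1.0, 4.0, 9.0]
psi0 = [0.6 + 0j, 0.0 + 0j, 0.8j]      # a unit vector

psi_t = evolve(psi0, energies, t=0.7)

# Unitarity: the norm is unchanged, in line with Proposition 3.16.
print(norm(psi0), norm(psi_t))
```

Because each coefficient is multiplied by a number of absolute value 1, the norm is preserved, and evolving for time s and then time t agrees with evolving for time s + t.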

It is not precisely true that every self-adjoint operator has an orthonor- 
mal basis of eigenvectors, even if the operator is bounded. Nevertheless, 
given a self-adjoint operator A, the spectral theorem tells us that there is a 
decomposition of H into “generalized eigenspaces” for A. It is, however, a 
bit complicated to state the precise sense of this decomposition, especially 
in the case of unbounded operators. Still, Claim 3.17 allows us to identify 
one goal for the spectral theorem: Whatever the spectral theorem says, it 
ought to allow us to make sense of the expression $e^{iaA}$, for any self-adjoint 
operator A and real number a. This goal will indeed be realized, in the 
bounded case in Chap.7 and in the unbounded case in Chap. 10. 

We should add two points of clarification regarding the expression (3.32). 
First, in writing (3.32), we have not “really” solved the Schrödinger equa- 
tion. For this expression to be useful, we need to compute $e^{-it\hat{H}/\hbar}$ in some 
relatively explicit way. If, for example, we can actually compute an or- 
thonormal basis of eigenvectors for $\hat{H}$, then in light of (3.33), we are on 
our way to understanding the behavior of the operator $e^{-it\hat{H}/\hbar}$. Second, 
although $\hat{H}$ is an unbounded operator, which is not defined on all of $\mathbf{H}$ 
but only on a dense subspace, the operator $e^{-it\hat{H}/\hbar}$ is unitary and de- 
fined on all of $\mathbf{H}$. Thus, the right-hand side of (3.32) makes sense for any 
$\psi_0$ in $\mathbf{H}$. Nevertheless, we cannot expect that $e^{-it\hat{H}/\hbar}\psi_0$ actually solves the 
Schrödinger equation (in the natural Hilbert space sense) unless $\psi_0$ belongs 
to the domain of $\hat{H}$. (See Lemma 10.17 in Sect. 10.2.) 


3.7.3 Eigenvectors and the Time-Independent Schrödinger Equation 


As we saw in the preceding section, eigenvectors for the Hamiltonian oper- 
ator are of great importance in solving the Schrödinger equation. In light 
of this fact, we make the following definition. 


Definition 3.18 If $\hat{H}$ is the Hamiltonian operator for a quantum system, 
the eigenvector equation 

\[ \hat{H}\psi = E\psi, \quad E \in \mathbb{R}, \tag{3.34} \]

is called the time-independent Schrödinger equation. 


As always in eigenvector equations, we are trying to determine both the 
numbers E for which (3.34) has a nonzero solution (the eigenvalues) and the 
corresponding vectors $\psi$ (the eigenvectors). When quantum texts speak of 
“solving,” say, the quantum harmonic oscillator, what they usually mean is 
finding all of the solutions to the time-independent Schrödinger equation. 
(See, e.g., Chaps. 5 and 11.) If $\psi$ is a solution to the time-independent 
Schrödinger equation, then the solution to the time-dependent Schrödinger 
equation with initial condition $\psi$ is simply $\psi(t) = e^{-itE/\hbar}\psi$. Since $\psi(t)$ is 
just a constant multiple of $\psi$, we see that $\psi(t)$ represents the same physical 
state as $\psi$. Thus, a solution to the time-independent Schrödinger equation 
is sometimes called a stationary state. 
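The claim that $e^{-itE/\hbar}\psi$ solves the time-dependent equation can be verified in one line. The following worked check assumes the time-dependent Schrödinger equation in the form $i\hbar\, d\psi/dt = \hat{H}\psi$ and uses only (3.34) and the linearity of $\hat{H}$.

```latex
% Assume \hat{H}\psi = E\psi and set \psi(t) = e^{-itE/\hbar}\psi.
\[
  i\hbar \frac{d}{dt}\psi(t)
  = i\hbar \left(-\frac{iE}{\hbar}\right) e^{-itE/\hbar}\psi
  = E\, e^{-itE/\hbar}\psi
  = e^{-itE/\hbar}\hat{H}\psi
  = \hat{H}\psi(t),
  \qquad \psi(0) = \psi .
\]
% Since |e^{-itE/\hbar}| = 1, \psi(t) and \psi differ by a constant of
% absolute value 1 and so represent the same physical state.
```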


3.7.4 The Schrödinger Equation in $\mathbb{R}^1$ 


Let us now consider the simplest example for the Hamiltonian operator 
$\hat{H}$. For a particle moving in $\mathbb{R}^1$, recall (Sect. 3.5) that we have identified 
the position operator X as being multiplication by x and the momentum 
operator as $P = -i\hbar\, d/dx$. The classical Hamiltonian for such a particle 
is typically taken to be of the form $H(x, p) = p^2/(2m) + V(x)$, where V is 
the potential energy function. In that case, we may reasonably take 

\[ \hat{H} = \frac{P^2}{2m} + V(X). \]

Here the operator V(X) is simply multiplication by the potential energy 
function V(x). (This operator may also be thought of as the function V 
applied to the operator X in the sense of the functional calculus coming 
from the spectral theorem.) We see, then, that 

\[ \hat{H}\psi(x) = -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi(x). \tag{3.35} \]


An operator of the form (3.35), or an analogously defined operator in higher 
dimensions, is referred to as a Schrödinger operator. (The term Hamilto- 
nian operator refers more generally to whatever operator governs the time- 
evolution of a quantum system, regardless of its form.) 

If our Hamiltonian is of the form given in (3.35), then the time-dependent 
Schrödinger equation takes the form 

\[ \frac{\partial \psi(x, t)}{\partial t}
   = \frac{i\hbar}{2m}\frac{\partial^2 \psi(x, t)}{\partial x^2}
   - \frac{i}{\hbar} V(x)\psi(x, t), \tag{3.36} \]

which is a linear partial differential equation. By contrast, Newton’s 
equation for a particle in $\mathbb{R}^1$ is a typically nonlinear ordinary differential 
equation. 

For a particle in $\mathbb{R}^1$, the time-independent Schrödinger equation is an 
ordinary differential equation, one that is linear but that has nonconstant 
coefficients, unless V happens to be constant. For simple examples of the 
potential function V, there are relatively standard methods of ordinary 
differential equations that can be brought to bear on the time-independent 
Schrödinger equation. 


3.7.5 Time-Evolution of the Expected Position 
and Expected Momentum 


Since a quantum particle does not have a fixed position or momentum, it 
does not make sense to ask whether the particle satisfies Newton’s equation. 
It does, however, make sense to ask whether the expected values of the po- 
sition and momentum satisfy Newton’s equation (in the form of Hamilton’s 
equations). 


Proposition 3.19 Suppose $\psi(t)$ is a solution to the Schrödinger equa- 
tion (3.36) for a sufficiently nice potential V and for a sufficiently nice 
initial condition $\psi(0) = \psi_0$. Then the expected position and expected mo- 
mentum in the state $\psi(t)$ satisfy 

\[ \frac{d}{dt}\langle X \rangle_{\psi(t)} = \frac{1}{m}\langle P \rangle_{\psi(t)} \tag{3.37} \]
\[ \frac{d}{dt}\langle P \rangle_{\psi(t)} = -\langle V'(X) \rangle_{\psi(t)}. \tag{3.38} \]

The assumptions in the proposition are there for two reasons: First, to en- 
sure that $\hat{H}$ is actually a self-adjoint operator (see Sect. 9.9) and second, to 
ensure that the domain assumptions in Proposition 3.14 are satisfied. If we 
assume, for example, that V(x) is a bounded-below polynomial in x and 
that $\psi_0$ belongs to the Schwartz space (A.15), then both of these concerns 
will be taken care of. Once these technicalities are addressed, the proof of 
Proposition 3.19 is a straightforward application of Proposition 3.14; see 
Exercise 4. Note that (3.37) says that in a certain sense, the velocity of a 
quantum particle is 1/m times the momentum, just as in the classical case. 

At first glance, it might appear that the pair $(\langle X \rangle_{\psi(t)}, \langle P \rangle_{\psi(t)})$ is a solu- 
tion to Hamilton’s equations, and indeed (3.37) is precisely what Hamilton’s 
equations require. To get a solution to Hamilton’s equations, however, we 
would need the right-hand side of (3.38) to equal $-V'(\langle X \rangle_{\psi(t)})$. But in 
general, 

\[ \langle V'(X) \rangle_\psi \neq V'(\langle X \rangle_\psi). \]

Consider, for example, the case $V'(x) = x^2 + x^3$. If $\psi$ is an even func- 
tion, then $\langle X \rangle_\psi = 0$ and so $V'(\langle X \rangle_\psi) = 0$. But $\langle X^2 + X^3 \rangle_\psi$ will not be 
zero, because the $X^3$ term will be zero and the $X^2$ term will be positive. 
We conclude, then, that $\langle X \rangle_{\psi(t)}$ and $\langle P \rangle_{\psi(t)}$ usually do not evolve along 
solutions to Hamilton’s equations. 
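The failure of $\langle V'(X) \rangle_\psi = V'(\langle X \rangle_\psi)$ can be seen numerically for a specific even wave function. The sketch below assumes a Gaussian $\psi(x) = \pi^{-1/4} e^{-x^2/2}$ (a choice made here for illustration, not taken from the text) and approximates the expectation values by a midpoint rule on a truncated interval; the function names and grid parameters are likewise hypothetical.

```python
import math

def density(x):
    """|psi(x)|^2 for the assumed Gaussian psi(x) = pi^{-1/4} exp(-x^2/2)."""
    return math.exp(-x * x) / math.sqrt(math.pi)

def expectation(f, a=-10.0, b=10.0, n=4000):
    """Midpoint-rule approximation to the integral of f(x) |psi(x)|^2 dx."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += f(x) * density(x)
    return total * h

def v_prime(x):
    """The example V'(x) = x^2 + x^3 from the text."""
    return x ** 2 + x ** 3

mean_x = expectation(lambda x: x)        # vanishes because psi is even
lhs = expectation(v_prime)               # <V'(X)>: only the x^2 term survives
rhs = v_prime(mean_x)                    # V'(<X>)

print(mean_x, lhs, rhs)
```

For this Gaussian, $\langle X^2 \rangle = 1/2$ and $\langle X^3 \rangle = 0$, so the left-hand side is 1/2 while the right-hand side is 0.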

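There is, however, one case in which $\langle V'(X) \rangle_\psi$ coincides with $V'(\langle X \rangle_\psi)$, 
and that is the case in which V is quadratic, in which case $V'$ is linear. In 
that case we have 

\[ \langle V'(X) \rangle_\psi = \langle aX + bI \rangle_\psi = a\langle X \rangle_\psi + b = V'(\langle X \rangle_\psi). \]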


Thus, the expected position and expected momentum do follow classical 
trajectories in the case of a quadratic potential. It is not surprising that 
this case is special in quantum mechanics, since it is also special in classical 
mechanics; this is the case in which Newton’s law is a linear differential 
equation. 
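As a worked instance of the quadratic case, take the harmonic oscillator potential $V(x) = \tfrac{1}{2}kx^2$ with $k > 0$ (a specific choice made here for illustration). Combining (3.37) and (3.38) then gives a closed classical equation for the expected position.

```latex
% Assume V(x) = k x^2 / 2, so V'(x) = kx, and write \omega = \sqrt{k/m}.
\[
  \frac{d}{dt}\langle X \rangle_{\psi(t)} = \frac{1}{m}\langle P \rangle_{\psi(t)},
  \qquad
  \frac{d}{dt}\langle P \rangle_{\psi(t)} = -k\,\langle X \rangle_{\psi(t)},
\]
\[
  \text{so that}\quad
  m\,\frac{d^2}{dt^2}\langle X \rangle_{\psi(t)} = -k\,\langle X \rangle_{\psi(t)},
  \qquad
  \langle X \rangle_{\psi(t)}
  = \langle X \rangle_{\psi_0}\cos(\omega t)
  + \frac{\langle P \rangle_{\psi_0}}{m\omega}\sin(\omega t).
\]
```

This is exactly the classical oscillator trajectory with initial position $\langle X \rangle_{\psi_0}$ and initial momentum $\langle P \rangle_{\psi_0}$.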

Although the expected position and expected momentum do not (in gen- 
eral) exactly follow classical trajectories, they will do so approximately un- 
der certain conditions. If the wave function $\psi(x)$ is concentrated mostly 
near a single point $x = x_0$, then $\langle V'(X) \rangle_\psi$ and $V'(\langle X \rangle_\psi)$ will both be 
approximately equal to $V'(x_0)$. In that case, the expected position and 
expected momentum of the particle will approximately follow a classical 
trajectory, at least for as long as the wave function remains concentrated 
near a single point. 


3.8 The Heisenberg Picture 


The “Heisenberg picture” of quantum mechanics is based on Heisenberg’s 
matrix model of quantum mechanics (Sect. 1.3). In the Heisenberg picture, 
one thinks of the operators (quantum observables) as evolving in time, while 
the vectors in the Hilbert space (quantum states) remain independent of 
time. This is to be contrasted with the approach to quantum mechanics 
we have been using up to now (the “Schrödinger picture”), in which the 
observables are independent of time and the states evolve in time. 


Definition 3.20 In the Heisenberg picture, each self-adjoint operator A 
evolves in time according to the operator-valued differential equation 

\[ \frac{dA(t)}{dt} = \frac{1}{i\hbar}[A(t), \hat{H}], \tag{3.39} \]

where $\hat{H}$ is the Hamiltonian operator of the system, and where $[\cdot, \cdot]$ is the 
commutator, given by [A, B] = AB − BA. 


Note that since $\hat{H}$ commutes with itself, the operator $\hat{H}$ remains constant 
in time, even in the Heisenberg picture. This observation is the quantum 
counterpart to the fact that the classical Hamiltonian H remains constant 
along a solution of Hamilton’s equations. 

Given the self-adjoint operator $\hat{H}$, the spectral theorem will give us a way 
to construct a family of unitary operators $e^{-it\hat{H}/\hbar}$, $t \in \mathbb{R}$, and this family of 
operators computes the time-evolution of states in the Schrödinger picture 
(Sect. 3.7.2). It is easy to check (at least formally) that the solution to 
(3.39) can be expressed as 

\[ A(t) = e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}. \tag{3.40} \]

Now, if $\psi$ is the state of the system (now considered to be independent of 
time), then the expectation of A(t) in the state $\psi$ is defined to be $\langle A(t) \rangle_\psi = 
\langle \psi, A(t)\psi \rangle$. We may then compute that 

\[ \langle A(t) \rangle_\psi
   = \langle \psi, e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}\psi \rangle
   = \langle e^{-it\hat{H}/\hbar}\psi, A e^{-it\hat{H}/\hbar}\psi \rangle
   = \langle \psi(t), A\psi(t) \rangle, \]

where $\psi(t)$ is the time-evolved state of the system in the Schrödinger picture. 
Here, we have used that the adjoint of $e^{it\hat{H}/\hbar}$ is $e^{-it\hat{H}/\hbar}$, which is formally 
clear and which is a consequence of the spectral theorem. 

Note that in the Schrödinger picture, $\langle \psi(t), A\psi(t) \rangle$ is the expectation 
value of A in the state $\psi(t)$. We conclude, then, that the Heisenberg picture 
and the Schrödinger picture give rise to precisely the same expectation 
values for observables as a function of time, and are therefore physically 
equivalent. Although we will work primarily with the Schrödinger picture of 
quantum mechanics, the Heisenberg picture is also important, for example, 
in quantum field theory. 
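The equivalence of the two pictures can be illustrated on a two-level system. In the sketch below, the diagonal Hamiltonian, the observable A, the state, the time t, and the units with $\hbar = 1$ are all hypothetical choices made for illustration. The Heisenberg-picture operator is built as $e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}$, and both expectation values are computed with an inner product that is conjugate-linear in the first factor.

```python
import cmath
import math

HBAR = 1.0  # assumed units in which hbar = 1

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    """Conjugate transpose (adjoint) of a square matrix."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inner(phi, psi):
    """Inner product, conjugate-linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

# Hypothetical two-level system: H = diag(E1, E2), so e^{-itH/hbar} is diagonal.
E1, E2 = 1.0, 3.0
t = 0.4
U = [[cmath.exp(-1j * t * E1 / HBAR), 0j],
     [0j, cmath.exp(-1j * t * E2 / HBAR)]]

A = [[0j, 1 + 0j], [1 + 0j, 0j]]                 # a self-adjoint observable
psi = [complex(1 / math.sqrt(2)), complex(1 / math.sqrt(2))]

# Heisenberg picture: evolve the operator, keep the state fixed.
A_t = matmul(matmul(dagger(U), A), U)
heisenberg = inner(psi, apply(A_t, psi))

# Schrodinger picture: evolve the state, keep the operator fixed.
psi_t = apply(U, psi)
schrodinger = inner(psi_t, apply(A, psi_t))

print(heisenberg, schrodinger, math.cos((E2 - E1) * t / HBAR))
```

For this particular choice, both expectation values reduce to $\cos((E_2 - E_1)t/\hbar)$, which gives an independent check.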


Proposition 3.21 Suppose $\hat{H} = P^2/(2m) + V(X)$, where V is a bounded- 
below polynomial. Then for any $t \in \mathbb{R}$ we have 

\[ \hat{H} = \frac{1}{2m}(P(t))^2 + V(X(t)). \tag{3.41} \]

Note that since $[\hat{H}, \hat{H}] = 0$, the Hamiltonian $\hat{H}$ is independent of time, 
even in the Heisenberg picture. Thus, the right-hand side of (3.41) is ac- 
tually independent of t, even though P(t) and X(t) depend on t. Equa- 
tion (3.41) holds also for sufficiently nice nonpolynomial functions V, but 
some limiting argument would be required in the proof. The assumption 
that V be bounded below is to ensure that $\hat{H}$ is actually an (essentially) 
self-adjoint operator; compare Sect. 9.10. 


Lemma 3.22 Suppose A is a self-adjoint operator on $\mathbf{H}$ and that $A(\cdot)$ is 
a solution to (3.39) with A(0) = A. Then for any positive integer m, the 
map 

\[ t \mapsto (A(t))^m \]

is also a solution to (3.39). 




That is to say, the time-evolution of the mth power of A is the same as 
the mth power of the time-evolution of A; that is, $A^m(t) = (A(t))^m$. 
Proof. If we use (3.40), then the result holds because 

\[ e^{it\hat{H}/\hbar} A^m e^{-it\hat{H}/\hbar}
   = e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}\, e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar} \cdots e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar}
   = \left( e^{it\hat{H}/\hbar} A e^{-it\hat{H}/\hbar} \right)^m. \]

It is also easy to check that $A(t)^m$ satisfies the differential equation (3.39). 
∎ 

With this lemma in hand, it is easy to prove the proposition. 
Proof of Proposition 3.21. On the one hand, since [H,H] = 0, the 
time-evolved operator H(t) is simply equal to H. On the other hand, if we 
time-evolve $P^2/(2m) + V(X)$ using Lemma 3.22, we obtain the expression 
on the right-hand side of (3.41). 


Proposition 3.23 Suppose the Hamiltonian of a quantum system is as 
in Proposition 3.21. Then the operators X(t) and P(t) defined by (3.39) 
satisfy the following operator-valued differential equation: 

\[ \frac{dX}{dt} = \frac{1}{m}P(t), \qquad
   \frac{dP}{dt} = -V'(X(t)). \tag{3.42} \]

Proof. See Exercise 7. ∎ 

Proposition 3.23 means that the operator-valued functions X(t) and P(t) 
satisfy the operator analogs of the classical equations of motion dx/dt = 
p(t)/m and $dp/dt = -V'(x(t))$. Nevertheless, the expectation values of X(t) 
and P(t) do not satisfy the ordinary equations of motion, as we have already 
seen by calculating in the Schrödinger picture. If we take expectation values 
in the system (3.42), we get the same answer as in Proposition 3.19, namely, 

\[ \frac{d}{dt}\langle X(t) \rangle_\psi = \frac{1}{m}\langle P(t) \rangle_\psi, \qquad
   \frac{d}{dt}\langle P(t) \rangle_\psi = -\langle V'(X(t)) \rangle_\psi. \]

These are not the classical equations of motion, unless the expectation value 
of the operator $V'(X(t))$ coincides with $V'$ applied to the expectation value 
of X(t), which is usually not the case. 


3.9 Example: A Particle in a Box 


Let us consider quantum mechanics in one space dimension for a particle 
that is confined to move in a “box,” which we describe as the interval 
$0 \le x \le L$. Our goal is to find all of the eigenvectors and eigenvalues of 
the Schrödinger operator, that is, to find solutions of the time-independent 
Schrödinger equation $\hat{H}\psi = E\psi$. In solving this equation, we may think of 
the constraint to the box as follows. Imagine a particle moving in $\mathbb{R}^1$ in the 
presence of a potential V that is 0 for x between 0 and L and takes some 
very large constant value C on the rest of the real line. Classically, this 
would mean that the particle has to have very high energy (greater than 
C) to escape from the box. Quantum mechanically, if we have a solution 
of the time-independent Schrödinger equation $\hat{H}\psi = E\psi$ for this potential 
(with E < C), then we expect $\psi$ to decay rapidly for x outside of the box. 
(We will see this behavior explicitly in Chap. 5.) In the limit as C tends to 
infinity, we expect solutions of the time-independent Schrödinger equation 
to be zero outside the box and to tend to zero as we approach the ends of 
the box. 

The upshot of this discussion is that we are looking for smooth functions 
$\psi$ on [0, L] that satisfy the differential equation 

\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi(x), \quad 0 \le x \le L, \tag{3.43} \]

and the boundary conditions 

\[ \psi(0) = \psi(L) = 0. \tag{3.44} \]


For E > 0, the solution space to (3.43) will be the span of two complex 
exponentials, or equivalently a sine and a cosine function: 

\[ \psi(x) = a \sin\left( \frac{\sqrt{2mE}}{\hbar}x \right) + b \cos\left( \frac{\sqrt{2mE}}{\hbar}x \right). \tag{3.45} \]

If we now impose the boundary condition $\psi(0) = 0$, we get that b = 0, 
leaving only the sine term. If we then impose the condition $\psi(L) = 0$, we 
will obtain a = 0 (which would mean that $\psi$ is identically zero) unless 

\[ \sin\left( \frac{\sqrt{2mE}}{\hbar}L \right) = 0. \tag{3.46} \]

Since we are interested in solutions to (3.43) where $\psi$ is not identically 
zero, we want (3.46) to hold. Thus, the argument of the sine function must be 
an integer multiple of $\pi$. This condition imposes a restriction on the value 
of E, namely that E should be of the form 

\[ E_j = \frac{j^2\pi^2\hbar^2}{2mL^2}, \tag{3.47} \]

for some positive integer j. 

It is a simple exercise (Exercise 8) to verify that for $E \le 0$, the only 
solution to (3.43) satisfying the boundary conditions (3.44) is the one with 
$\psi$ identically zero. 


Proposition 3.24 The following functions are solutions to (3.43) 
satisfying the boundary conditions (3.44): 

\[ \psi_j(x) = \sqrt{\frac{2}{L}} \sin\left( \frac{j\pi x}{L} \right), \quad j = 1, 2, 3, \ldots, \]

and the corresponding eigenvalues $E_j$ are given by (3.47). The functions 
$\psi_j$ form an orthonormal basis for the Hilbert space $L^2([0, L])$. 


Proof. We have already verified the equation and eigenvalue for each $\psi_j$. 
It is a simple computation to verify that the $\psi_j$’s are orthonormal, and the 
elementary theory of Fourier series (Fourier sine series, in this case) shows 
that the $\psi_j$’s form an orthonormal basis for $L^2([0, L])$. ∎ 
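The eigenvalues (3.47) and the orthonormality asserted in Proposition 3.24 are easy to check numerically. The following sketch assumes units with $\hbar = m = L = 1$ and uses a midpoint rule for the inner product on [0, L]; the function names and the grid size are illustrative choices, not part of the text.

```python
import math

# Assumed units: hbar = m = L = 1 (illustrative only).
HBAR = 1.0
MASS = 1.0
L = 1.0

def energy(j):
    """Eigenvalue E_j = j^2 pi^2 hbar^2 / (2 m L^2) from (3.47)."""
    return (j * math.pi * HBAR) ** 2 / (2 * MASS * L ** 2)

def psi(j, x):
    """Normalized eigenfunction sqrt(2/L) sin(j pi x / L)."""
    return math.sqrt(2 / L) * math.sin(j * math.pi * x / L)

def inner(j, k, n=2000):
    """Midpoint-rule approximation to the L^2([0, L]) inner product
    of psi_j and psi_k (both are real, so no conjugate is needed)."""
    h = L / n
    return sum(psi(j, (i + 0.5) * h) * psi(k, (i + 0.5) * h)
               for i in range(n)) * h

for j in (1, 2, 3):
    print(j, energy(j), [round(inner(j, k), 10) for k in (1, 2, 3)])
```

The printed rows should approximate the rows of the 3 × 3 identity matrix, and the energies grow like $j^2$.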

The Hamiltonian operator for this problem (in which V = 0 inside the 
box) is given by 

\[ \hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}. \]

This operator is an unbounded operator and is not defined on the whole 
Hilbert space $L^2([0, L])$, but only on a dense subspace $\mathrm{Dom}(\hat{H}) \subset L^2([0, L])$. 
The domain of $\hat{H}$ should be chosen in such a way that $\hat{H}$ is essentially self- 
adjoint and, thus, symmetric (Sect. 3.2), meaning that 

\[ \langle \phi, \hat{H}\psi \rangle = \langle \hat{H}\phi, \psi \rangle \tag{3.48} \]

for all $\phi, \psi$ in $\mathrm{Dom}(\hat{H})$. For (3.48) to hold, $\phi$ and $\psi$ must satisfy appro- 
priate boundary conditions, which will allow the boundary terms in the 
integration by parts to be zero. (See Exercise 9.) 

Mathematically, then, it is necessary to impose some boundary condi- 
tions in order for H to be an essentially self-adjoint operator. The particular 
choice of boundary conditions (3.44) is based on the idea of approximating 
the box by a very large “confining” potential outside the box. See Chap. 9 
for an extensive discussion of domain issues for unbounded operators. 


3.10 Quantum Mechanics for a Particle in $\mathbb{R}^n$ 


Up to this point, we have been considering a quantum particle moving 
in $\mathbb{R}^1$. It is straightforward, however, to generalize to a quantum particle 
moving in $\mathbb{R}^n$. The Hilbert space for a particle in $\mathbb{R}^n$ is $L^2(\mathbb{R}^n)$, rather than 
$L^2(\mathbb{R})$. Instead of a single position operator, we have n such operators, given 
by 

\[ X_j\psi(x) = x_j\psi(x), \quad j = 1, 2, \ldots, n. \]

Similarly, we have n momentum operators, given by 

\[ P_j\psi(x) = -i\hbar\frac{\partial \psi}{\partial x_j}. \]

As in the $\mathbb{R}^1$ case, $X_j$ does not commute with $P_j$ but satisfies $[X_j, P_j] = 
i\hbar I$. On the other hand, $X_j$ commutes with $X_k$ and $P_j$ commutes with $P_k$. 
Furthermore, $X_j$ commutes with $P_k$ for $j \neq k$. These formulas are referred 
to as the canonical commutation relations. 


Proposition 3.25 (Canonical Commutation Relations) The position 
and momentum operators satisfy 

\[ \frac{1}{i\hbar}[X_j, X_k] = 0 \]
\[ \frac{1}{i\hbar}[P_j, P_k] = 0 \]
\[ \frac{1}{i\hbar}[X_j, P_k] = \delta_{jk} I \tag{3.49} \]

for all $1 \le j, k \le n$. 


These relations are the quantum counterparts of the Poisson bracket rela- 
tions among the position and momentum functions in classical mechanics. 
Specifically, the role of the Poisson bracket in Proposition 2.24 is played in 
Proposition 3.25 by the quantity $(1/(i\hbar))[\cdot, \cdot]$. 
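The third relation in (3.49) can be checked directly from the definitions of the position and momentum operators. The following worked computation assumes $\psi$ is smooth enough that the product rule applies; domain questions are ignored here.

```latex
% With X_j \psi = x_j \psi and P_k \psi = -i\hbar\, \partial\psi/\partial x_k:
\[
  X_j P_k \psi = -i\hbar\, x_j \frac{\partial \psi}{\partial x_k},
  \qquad
  P_k X_j \psi = -i\hbar \frac{\partial}{\partial x_k}\bigl(x_j \psi\bigr)
               = -i\hbar\, \delta_{jk}\,\psi - i\hbar\, x_j \frac{\partial \psi}{\partial x_k},
\]
\[
  \text{so that}\quad
  [X_j, P_k]\psi = X_j P_k \psi - P_k X_j \psi = i\hbar\, \delta_{jk}\, \psi .
\]
```

The first two relations follow in the same way, since multiplication operators commute with one another and mixed partial derivatives of smooth functions commute.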

If the classical Hamiltonian for a particle in $\mathbb{R}^n$ is of the usual form 
(kinetic energy plus potential energy), then we may analogously define the 
Hamiltonian operator to be of the form 

\[ \hat{H} = \sum_{j=1}^{n} \frac{P_j^2}{2m} + V(X), \tag{3.50} \]

where V(X) denotes the result of applying the function V to the commuting 
family of operators $X = (X_1, \ldots, X_n)$. It is natural to identify V(X) with 
the operator of multiplication by the function V(x). In that case, we may 
write $\hat{H}$ more explicitly as 

\[ \hat{H}\psi(x) = -\frac{\hbar^2}{2m}\Delta\psi(x) + V(x)\psi(x), \]

where $\Delta$ is the Laplacian, given by 

\[ \Delta = \sum_{j=1}^{n} \frac{\partial^2}{\partial x_j^2}. \]

We refer to an operator of the form (3.50) as a Schrödinger operator. 
We may also introduce angular momentum operators defined by analogy 
to the classical angular momentum functions. 


Definition 3.26 For each pair (j, k) with $1 \le j, k \le n$, define the angular 
momentum operator $\hat{J}_{jk}$ by the formula 

\[ \hat{J}_{jk} = X_j P_k - X_k P_j. \]

As in the classical case, we have $\hat{J}_{jk} = 0$ when j = k. When $j \neq k$, $X_j$ 
and $P_k$ commute, so the order of the factors in the definition of $\hat{J}_{jk}$ is not 
important. Explicitly, we have 

\[ \hat{J}_{jk} = -i\hbar\left( x_j\frac{\partial}{\partial x_k} - x_k\frac{\partial}{\partial x_j} \right). \]

The operator in parentheses is the angular derivative ($\partial/\partial\theta$) in the $(x_j, x_k)$ 
plane. 

When n = 3, it is customary to use the quantum counterpart of the 
classical angular momentum vector, namely, 

\[ \hat{J}_1 = X_2 P_3 - X_3 P_2; \quad \hat{J}_2 = X_3 P_1 - X_1 P_3; \quad \hat{J}_3 = X_1 P_2 - X_2 P_1. \tag{3.51} \]

When n = 3, every $\hat{J}_{jk}$ with $j \neq k$ is one of the above three operators or 
the negative thereof. 


3.11 Systems of Multiple Particles 


Suppose now we have a system of N quantum particles moving in $\mathbb{R}^n$. If the 
particles are all of different types (e.g., one electron and one proton), then 
the Hilbert space for this system is $L^2(\mathbb{R}^{nN})$. That is, the wave function 
$\psi$ of the system is a function of variables $x^1, x^2, \ldots, x^N$, with each $x^j$ 
belonging to $\mathbb{R}^n$. If we normalize $\psi$ to be a unit vector in $L^2(\mathbb{R}^{nN})$, then 
$|\psi(x^1, x^2, \ldots, x^N)|^2$ is to be interpreted as the joint probability distribution 
for the positions of the N particles. 

We may introduce position operators $X^j_k$ (the kth component of the 
position of the jth particle) and momentum operators $P^j_k$ in obvious anal- 
ogy to the definition for a single particle. The typical Hamiltonian operator 
for such a system is then 

\[ \hat{H}\psi(x^1, \ldots, x^N) = -\sum_{j=1}^{N} \frac{\hbar^2}{2m_j}\Delta_j\psi(x^1, \ldots, x^N) + V(x^1, \ldots, x^N)\psi(x^1, \ldots, x^N), \]

where $m_j$ is the mass of the jth particle. Here $\Delta_j$ means the Laplacian 
with respect to the variable $x^j \in \mathbb{R}^n$, with the other variables fixed. 

As we will see in Chap. 19, the Hilbert space for a composite system, 
made up of various subsystems, is typically taken to be the (Hilbert) tensor 
product of the individual Hilbert spaces. In the present context, we may 
think of our system as being made up of N subsystems, each being one of the 
individual particles. Fortunately, there is a natural isomorphism (Proposi- 
tion 19.12) between $L^2(\mathbb{R}^{nN})$ and the tensor product of N copies of $L^2(\mathbb{R}^n)$, 
so that the approach we are taking here is consistent with the general 
philosophy. 

If the particles in question are identical (say, all electrons), then there 
is an additional complication to the description of the Hilbert space for 
the system. In standard quantum theory, we are supposed to believe that 
“identical particles are indistinguishable.” What this means is that the wave 
function should have the property that if we interchange, say, $x^1$ with $x^2$, 
then the new wave function should represent the same physical state as 
the original wave function. Recalling that two unit vectors in the quantum 
Hilbert space represent the same physical state if and only if they differ by 
a constant of absolute value 1, this means we should have 

\[ \psi(x^2, x^1, x^3, \ldots, x^N) = u\,\psi(x^1, x^2, x^3, \ldots, x^N) \]

for some constant u with |u| = 1. Applying this rule twice gives that $\psi$ is 
$u^2\psi$, so evidently u must be either 1 or −1. 

Particles in quantum mechanics are grouped into two types, according 
to whether the constant u in the previous paragraph is 1 or —1. Particles 
with u = 1 are called bosons and particles with u = —1 are called fermions. 
Whether a particle is a boson or a fermion is determined by the spin of the 
particle, a concept that we have not yet introduced. Nevertheless, we can 
say that particles without spin are bosons. For a collection of N identical 
spinless particles moving in $\mathbb{R}^3$, the proper Hilbert space is the symmetric 
subspace of $L^2(\mathbb{R}^{3N})$, that is, the space of functions in $L^2(\mathbb{R}^{3N})$ that are 
invariant under arbitrary permutations of the variables. We will have more 
to say about spin and systems of identical particles in Chaps. 17 and 19. 
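For two particles on the line, the two possible exchange behaviors can be produced from any function of two variables by symmetrizing or antisymmetrizing. The sketch below is only an illustration: the names `symmetrize` and `antisymmetrize` and the sample function `f` are hypothetical, and the resulting functions are left unnormalized.

```python
import math

def symmetrize(f):
    """Return the function (x1, x2) -> f(x1, x2) + f(x2, x1), which is
    invariant under exchange (u = 1, the bosonic case)."""
    return lambda x1, x2: f(x1, x2) + f(x2, x1)

def antisymmetrize(f):
    """Return the function (x1, x2) -> f(x1, x2) - f(x2, x1), which changes
    sign under exchange (u = -1, the fermionic case)."""
    return lambda x1, x2: f(x1, x2) - f(x2, x1)

def f(x1, x2):
    """Hypothetical product of two different one-particle functions."""
    return math.exp(-x1 ** 2) * x2 * math.exp(-x2 ** 2)

boson = symmetrize(f)
fermion = antisymmetrize(f)

a, b = 0.3, 1.1
print(boson(a, b), boson(b, a))        # equal
print(fermion(a, b), fermion(b, a))    # opposite signs
print(fermion(a, a))                   # vanishes when the arguments coincide
```

The antisymmetric combination vanishes whenever the two arguments coincide, which is the position-space form of the exclusion behavior associated with u = −1.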


3.12 Physics Notation 


In quantum mechanics, physicists almost invariably use the Dirac nota- 
tion (or bra-ket notation) introduced by Dirac in 1939 [5]. This notation 
is made up of Notations 3.27-3.29 below. In this section, we explore the 
Dirac notation along with a few other notational differences between the 
mathematics and physics literature. 

Before proceeding it is important to point out that when using Dirac 
notation, it is essential that the complex conjugate in the inner product 
should go on the first factor. 


Notation 3.27 A vector $\psi$ in $\mathbf{H}$ is referred to as a ket and is denoted 
$|\psi\rangle$. A continuous linear functional on $\mathbf{H}$ is called a bra. For any $\phi \in \mathbf{H}$, 
let $\langle\phi|$ denote the bra given by 

\[ \langle\phi|(\psi) = \langle \phi, \psi \rangle. \]

That is to say, $\langle\phi|$ is the “inner product with $\phi$” functional. The bracket 
(or bra-ket) of two vectors $\phi, \psi \in \mathbf{H}$ is the result of applying the bra $\langle\phi|$ to 
the ket $|\psi\rangle$, namely the inner product of $\phi$ and $\psi$, denoted $\langle\phi|\psi\rangle$. 

If A is an operator on $\mathbf{H}$ and $\phi$ is a vector in $\mathbf{H}$, then we can form 
the linear functional $\langle\phi| A$, i.e., the linear map $\psi \mapsto \langle\phi|A\psi\rangle$. Physicists 
generally write an expression of this form as 

\[ \langle\phi|A|\psi\rangle. \]

This notation emphasizes that there are two different ways of thinking of 
this quantity. We may think of $\langle\phi|A|\psi\rangle$ either as the linear functional 
$\langle\phi| A$ applied to the vector $|\psi\rangle$, or as the linear functional $\langle\phi|$ applied to 
the vector $A|\psi\rangle$. 


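Notation 3.28 For any $\phi$ and $\psi$ in $\mathbf{H}$, the expression $|\phi\rangle\langle\psi|$ denotes the 
linear operator on $\mathbf{H}$ given by 

\[ (|\phi\rangle\langle\psi|)(\chi) = |\phi\rangle\langle\psi|\chi\rangle = (\langle\psi|\chi\rangle)\,|\phi\rangle. \]

That is, in mathematics notation, $|\phi\rangle\langle\psi|$ is the operator sending $\chi$ to $\langle\psi, \chi\rangle\phi$. 


The operator $|\phi\rangle\langle\psi|$ associates to each (ket) vector $|\chi\rangle$ a new vector in 
the only way that makes notational sense: We interpret $|\phi\rangle\langle\psi||\chi\rangle$ as the 
vector $|\phi\rangle$ multiplied by the scalar $\langle\psi|\chi\rangle$. 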


Notation 3.29 Given a family of vectors in $\mathbf{H}$ labeled by, say, three indices 
n, l, and m, rather than denoting these vectors as $|\psi_{n,l,m}\rangle$, a physicist will 
denote them simply as $|n, l, m\rangle$. 


This notation is not without its pitfalls. If we have two different sets 
of vectors labeled by the same set of indices, a mathematician can simply 
label them as $\phi_{n,l,m}$ and $\psi_{n,l,m}$, but the physicist has a problem. 

As an example of the Dirac notation, suppose that an operator $\hat{H}$ has 
an orthonormal basis of eigenvectors $\psi_n$. A physicist would express the 
decomposition of a general vector in terms of this basis as 

\[ I = \sum_n |n\rangle\langle n|, \tag{3.52} \]

where $\psi_n$ is represented simply as $|n\rangle$ and where $|n\rangle\langle n|$ is (given that $|n\rangle$ is 
a unit vector) the orthogonal projection onto the one-dimensional subspace 
spanned by the vector $|n\rangle$. 
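In finite dimensions, the operators $|\phi\rangle\langle\psi|$ and the identity (3.52) become statements about matrices. The sketch below works in $\mathbb{C}^2$ with an arbitrarily chosen orthonormal basis; the names `ket_bra`, `apply`, and `inner`, and the sample vectors, are assumptions made for illustration. The inner product is conjugate-linear in the first factor, as Dirac notation requires.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def ket_bra(phi, psi):
    """Matrix of |phi><psi|: entry (i, j) is phi_i * conj(psi_j)."""
    return [[p * q.conjugate() for q in psi] for p in phi]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# A hypothetical orthonormal basis of C^2 with complex entries.
s = 1 / math.sqrt(2)
e1 = [s + 0j, s * 1j]
e2 = [s + 0j, -s * 1j]

# Resolution of the identity: |e1><e1| + |e2><e2| should be I, as in (3.52).
P1, P2 = ket_bra(e1, e1), ket_bra(e2, e2)
total = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(P1, P2)]

# |e1><e2| applied to chi should equal <e2|chi> times e1.
chi = [2 + 1j, -1 + 3j]
lhs = apply(ket_bra(e1, e2), chi)
rhs = [inner(e2, chi) * c for c in e1]

print(total)
print(lhs, rhs)
```

A basis with complex entries is used so that placing the conjugate on the wrong factor would visibly break both checks.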


Notation 3.30 In the physics literature, the complex conjugate of a com- 
plex number z is denoted as $z^*$, rather than $\bar{z}$, as in the mathematics liter- 
ature. What a mathematician calls the adjoint of an operator and denotes 
by $A^*$, a physicist calls the Hermitian conjugate of A and denotes by $A^\dagger$. 
Physicists refer to self-adjoint operators as Hermitian. 


We may express the concept of an adjoint (or Hermitian conjugate) of 
an operator using Dirac notation, as follows. If A is a bounded operator on 
$\mathbf{H}$, then $A^\dagger$ is the unique bounded operator such that 

\[ \langle\phi| A = \langle A^\dagger\phi|. \]

One peculiarity of the physics literature on quantum mechanics is a 
conspicuous failure of most articles to state what the Hilbert space is. 
Rather than starting by defining the Hilbert space in which they are work- 
ing, physicists generally start by writing down the commutation relations 
that hold among various operators on the space. Thus, for example, a physi- 
cist might begin with position and momentum operators X and P, satis- 
fying $[X, P] = i\hbar I$, without ever specifying what space these operators are 
operating on. The justification for this omission is, presumably, the Stone–von 
Neumann theorem, which asserts that (provided the operators satisfy 
the expected “exponentiated” relations) there is, up to unitary equiva- 
lence, only one Hilbert space with operators satisfying these relations and 
on which the operators act irreducibly. (See Chap. 14 for a precise state- 
ment of the result.) It is, nevertheless, disconcerting for a mathematician to 
encounter an entire paper full of computations involving certain operators, 
without any specification of what space these operators are operating on, 
let alone how the operators act on the space. 

This practice among physicists represents something of a role reversal. 
In the setting of linear algebra, for example, a mathematician might say, 
“Let V be an n-dimensional vector space over $\mathbb{R}$.” If a physicist says, “Oh, so 
it’s $\mathbb{R}^n$,” the mathematician will reply, “No, no, you don’t have to choose a 
basis.” By contrast, in quantum mechanics, it is the physicist who does not 
want to choose a particular realization of the space. A physicist will simply 
write down the commutation relations between, say, X and P. If pressed, 
the physicist might say that he is working in an irreducible representation 
of those relations. If a mathematician then says, “Oh, so it’s $L^2(\mathbb{R})$,” the 
physicist will reply, “No, no, there is no preferred realization.” 


Notation 3.31 Given an irreducible representation of the canonical com- 
mutation relations, and given a vector $\psi$ in the corresponding Hilbert space, 
a physicist will speak of the position wave function $\psi(x)$, defined by 

\[ \psi(x) = \langle x|\psi\rangle. \tag{3.53} \]

Here, $\langle x|$ is the bra associated with the ket $|x\rangle$, where $|x\rangle$ is supposed to be 
an eigenvector for the position operator with eigenvalue x. 


See, again, Chap. 14 for the precise notion of “irreducible representa- 
tion of the canonical commutation relations.” One may similarly define the 
momentum wave function by taking the inner product of $\psi$ with the eigen- 
vectors of the momentum operator, which are also non-normalizable. See 
Sect. 6.6 for details. 

A mathematician might find Notation 3.31 objectionable on the grounds
that the operator $X$ does not actually have any eigenvectors. After all,
it is harmless, in view of the Stone-von Neumann theorem, to work in
the “Schrödinger representation,” in which our Hilbert space is $L^2(\mathbb{R})$ and
the position operator $X$ is just multiplication by $x$. Given a number $x_0$,
there is no nonzero element $\psi$ of $L^2(\mathbb{R})$ for which $X\psi = x_0\psi$. After all,
any $\psi$ satisfying this equation would have to be supported at the point
$x = x_0$, in which case $\psi$ would equal zero almost everywhere and would be
the zero element of $L^2(\mathbb{R})$. A physicist, on the other hand, would say that
the desired eigenfunction is $\psi(x) = \delta(x - x_0)$, where $\delta$ is the Dirac
delta-“function.” The fact that $\delta(x - x_0)$ is not actually in the Hilbert space
$L^2(\mathbb{R})$ does not concern the physicist; it is simply a “non-normalizable
state.” The mathematical theory of such non-normalizable states comes
under the heading “generalized eigenvectors.” See Sect. 6.6 for a discussion
of this issue in the case of the eigenvectors of the momentum operator.

A more subtle issue regarding the “position eigenvectors” is that each 
eigenvector is unique only up to multiplication by a constant. If one wants 
the momentum operator to act on the position wave function, as defined by 
(3.53), in the usual way, one must make a consistent choice of normalization 
of the eigenvectors of the position operators. Specifically, one should choose 
the constants in such a way that the exponentiated momentum operator
$\exp(-iaP/\hbar)$ maps $|x\rangle$ to $|x + a\rangle$.


3.13 Exercises 


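1. Suppose that $\phi(t)$ and $\psi(t)$ are differentiable functions with values in
a Hilbert space $\mathbf{H}$, meaning that the limit

\[ \frac{d\phi}{dt} := \lim_{h\to 0}\frac{\phi(t+h) - \phi(t)}{h} \]

exists in the norm topology of $\mathbf{H}$ for each $t$, and similarly for $\psi(t)$.
Show that

\[ \frac{d}{dt}\langle\phi(t),\psi(t)\rangle = \left\langle\frac{d\phi}{dt},\psi(t)\right\rangle + \left\langle\phi(t),\frac{d\psi}{dt}\right\rangle. \]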


2. Suppose A and B are operators on a finite-dimensional Hilbert space 
and suppose that $AB - BA = cI$ for some constant $c$. Show that
$c = 0$.


Note: This shows that the commutation relations in (3.8) are a purely 
infinite-dimensional phenomenon. 


3. If $A$ is a bounded operator on a Hilbert space $\mathbf{H}$, then there exists a
unique bounded operator $A^*$ on $\mathbf{H}$ satisfying $\langle\phi, A\psi\rangle = \langle A^*\phi, \psi\rangle$ for
all $\phi$ and $\psi$ in $\mathbf{H}$. (Appendix A.4.3.) The operator $A^*$ is called the
adjoint of $A$, and $A$ is called self-adjoint if $A^* = A$.


(a) Show that for any bounded operator $A$ and constant $c \in \mathbb{C}$, we
have $(cA)^* = \bar{c}A^*$, where $\bar{c}$ is the complex conjugate of $c$.


(b) Show that if $A$ and $B$ are self-adjoint, then the operator

\[ \frac{1}{i\hbar}[A, B] \]

is also self-adjoint.


4. Verify Proposition 3.19 using Proposition 3.14. Note that the operator
$V'(X)$ means simply the operator of multiplication by the function
$V'(x)$.


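5. Suppose that $\psi$ is a unit vector in $L^2(\mathbb{R})$ such that the functions
$x\psi(x)$ and $x^2\psi(x)$ also belong to $L^2(\mathbb{R})$. Show that

\[ \langle X^2\rangle_\psi > \left(\langle X\rangle_\psi\right)^2. \]

Hint: Consider the integral

\[ \int_{-\infty}^{\infty} (x - a)^2\,|\psi(x)|^2\,dx, \]

where $a = \langle X\rangle_\psi$.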
6. Consider the Hamiltonian $\hat{H}$ for a quantum harmonic oscillator, given
by

\[ \hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{k}{2}x^2, \]

where $k$ is the spring constant of the oscillator. Show that the function

\[ \psi_0(x) = \exp\left\{-\frac{\sqrt{km}}{2\hbar}\,x^2\right\} \]

is an eigenvector for $\hat{H}$ with eigenvalue $\hbar\omega/2$, where $\omega := \sqrt{k/m}$ is
the classical frequency of the oscillator.


Note: We will explore the eigenvectors and eigenvalues of $\hat{H}$ in detail
in Chap. 11.


7. Prove Proposition 3.23.


Hint: Show that [P(t), H] = ([P, H])(t) and [X(t), H] = ([X, H])(t). 


8. (a) Find the general solution to (3.43), where $E$ is a negative real
number. Show that the only such solution that satisfies the
boundary conditions (3.44) is identically zero.


(b) Establish the same result as in Part (a) for E = 0. 


9. (a) Suppose $\phi$ and $\psi$ are smooth functions on $[0, L]$ satisfying the
boundary conditions (3.44). Using integration by parts, show that

\[ \langle\phi, \hat{H}\psi\rangle = \langle\hat{H}\phi, \psi\rangle, \]

where $\hat{H} = -(\hbar^2/2m)\,d^2/dx^2$ and where

\[ \langle\phi, \psi\rangle = \int_0^L \overline{\phi(x)}\,\psi(x)\,dx. \]


(b) Show that the result of Part (a) fails if $\phi$ and $\psi$ are arbitrary
smooth functions (not satisfying the boundary conditions).


10. Let $\hat{J}_1$, $\hat{J}_2$, and $\hat{J}_3$ be the angular momentum operators for a particle
moving in $\mathbb{R}^3$. Using the canonical commutation relations
(Proposition 3.25), show that these operators satisfy the commutation
relations

\[ \frac{1}{i\hbar}[\hat{J}_1, \hat{J}_2] = \hat{J}_3; \quad \frac{1}{i\hbar}[\hat{J}_2, \hat{J}_3] = \hat{J}_1; \quad \frac{1}{i\hbar}[\hat{J}_3, \hat{J}_1] = \hat{J}_2. \]

This is the quantum mechanical counterpart to Exercise 19 in the
previous chapter.
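
As a numerical illustration of the note following Exercise 2, the sketch below truncates position and momentum operators to $n \times n$ matrices and inspects their commutator. The ladder-operator construction of $X$ and $P$, the units $\hbar = m = \omega = 1$, and the size $n = 6$ are assumptions made only for this illustration; they are not taken from this chapter. The commutator equals $i\hbar$ on every diagonal entry except the last, which is forced to compensate so that the trace vanishes.

```python
import numpy as np

hbar = 1.0  # assumption: units with hbar = 1
n = 6       # assumption: hypothetical truncation dimension

# Truncated lowering operator with a[j, j+1] = sqrt(j + 1).
a = np.diag(np.sqrt(np.arange(1, n)), k=1)

# Truncated position and momentum matrices (a is real, so a.T is its adjoint).
X = np.sqrt(hbar / 2) * (a + a.T)
P = 1j * np.sqrt(hbar / 2) * (a.T - a)

comm = X @ P - P @ X
diag = np.diag(comm)
print(np.round(diag, 10))

# The trace of any commutator of finite matrices vanishes.
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
random_trace = np.trace(A @ B - B @ A)
```

The last diagonal entry is $-i\hbar(n-1)$, so the truncated matrices cannot satisfy $XP - PX = i\hbar I$ for any finite $n$.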


4 
The Free Schrödinger Equation


In this chapter, we consider various methods of solving the free Schrödinger
equation in one space dimension. Here “free” means that there is no force
acting on the particle, so that we may take the potential $V$ to be identically
zero. Thus, the free Schrödinger equation is

\[ \frac{\partial\psi}{\partial t} = \frac{i\hbar}{2m}\frac{\partial^2\psi}{\partial x^2}, \tag{4.1} \]

subject to an initial condition of the form

\[ \psi(x, 0) = \psi_0(x). \]

We will identify some key features of solutions to this equation, such as the
“spread of the wave packet” and the distinction between “phase velocity”
and “group velocity.” In particular, the notion of group velocity will confirm
our expectation that a particle of momentum $p$ should travel with velocity
$v = p/m$.

Before attempting to solve the free Schrödinger equation, let us make a
simple observation about the time evolution of the expected values of the
position and momentum. If we apply Proposition 3.19 in the case that $V$
is identically equal to zero, we have

\[ \frac{d}{dt}\langle X\rangle_{\psi(t)} = \frac{1}{m}\langle P\rangle_{\psi(t)}, \qquad \frac{d}{dt}\langle P\rangle_{\psi(t)} = 0. \]

Thus, the expectation value of $P$ is independent of time, which then means
that the expectation value of $X$ is linear in time:

\[ \langle X\rangle_{\psi(t)} = \langle X\rangle_{\psi_0} + \frac{t}{m}\langle P\rangle_{\psi_0}. \]

Thus, the free Schrödinger equation is one of the special cases in which
the expected values of the position and momentum exactly follow classical
trajectories (and those classical trajectories are very simple in the case
$V = 0$).


4.1 Solution by Means of the Fourier Transform 


We look for solutions of the free Schrödinger equation on $\mathbb{R}^1$ of the form

\[ \psi(x, t) = e^{i(kx - \omega(k)t)}, \tag{4.2} \]

where $k$ is the frequency in space and $\omega(k)$ is the frequency in time, which
is an as-yet-undetermined function of $k$. (Of course, such a solution is not
square-integrable in $x$ for a fixed $t$, but we will find our way back to
square-integrable solutions eventually.) Plugging this into (4.1) easily gives the
formula for $\omega$ as a function of $k$:

\[ \omega(k) = \frac{\hbar k^2}{2m}. \tag{4.3} \]

A formula of this sort, expressing the temporal frequency $\omega$ as a function of
the spatial frequency $k$ in a solution of some partial differential equation,
is called a dispersion relation.
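
For readers who want the substitution leading to (4.3) written out, the following short computation is one way to see it; it uses only (4.1) and (4.2).

```latex
\begin{align*}
\frac{\partial\psi}{\partial t}
  &= -i\,\omega(k)\,e^{i(kx-\omega(k)t)}, \\
\frac{i\hbar}{2m}\frac{\partial^2\psi}{\partial x^2}
  &= \frac{i\hbar}{2m}\,(ik)^2\,e^{i(kx-\omega(k)t)}
   = -\frac{i\hbar k^2}{2m}\,e^{i(kx-\omega(k)t)}.
\end{align*}
% Equating the two right-hand sides and cancelling the common
% factor -i e^{i(kx - \omega(k)t)} gives \omega(k) = \hbar k^2/(2m).
```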


Observe that (4.2) can be written as

\[ \psi(x, t) = \exp\left[ik\left(x - \frac{\omega(k)}{k}\,t\right)\right]. \tag{4.4} \]

Now, replacing a function $f(x)$ by $f(x - a)$ has the effect of shifting $f$ to
the right by $a$. Thus, the time-evolution has the effect of shifting the initial
function to the right by an amount equal to $(\omega(k)/k)t$. This means that
the function $\psi(x, t)$ is moving to the right with speed $\omega(k)/k$. This speed,
for reasons that will be clearer in Sect. 4.3, is called the phase velocity.

The phase velocity, then, is the speed at which a pure exponential solution
of our equation (the free Schrödinger equation) propagates. We compute
the phase velocity as $\omega(k)/k = \hbar k/(2m)$. Now, we have said that a wave
function of the form $e^{ikx}$ represents a particle with momentum $p = \hbar k$.
We thus arrive at the following curious conclusion.

Proposition 4.1 The phase velocity of a particle with momentum $p = \hbar k$ is

\[ \text{phase velocity} = \frac{\omega(k)}{k} = \frac{\hbar k}{2m} = \frac{p}{2m}. \]

This velocity is half the velocity of a classical particle of momentum $p$.


Proposition 4.1 might make us think that our basic relation $p = \hbar k$ is
off by a factor of 2. We will see, however, that the phase velocity, that is,
the velocity of a pure exponential solution, is not the “real” velocity of a
particle with momentum $p$. The real velocity is the “group velocity,” which
will turn out to be, as expected, $p/m$.

Leaving aside for now the question of the velocity, let us build up a
general solution to (4.1) from solutions of the form (4.2). We make use of
the Fourier transform, discussed in Appendix A.3. We can then express the
solution to the free Schrödinger equation, for “nice” initial conditions, as a
“superposition” of these pure exponential solutions.


Proposition 4.2 Suppose that $\psi_0$ is a “nice” function, for example, a
Schwartz function (Definition A.15). Let $\hat{\psi}_0$ denote the Fourier transform
of $\psi_0$ and define $\psi(x, t)$ by

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \hat{\psi}_0(k)\,e^{i(kx - \omega(k)t)}\,dk, \tag{4.5} \]

where $\omega(k)$ is defined by (4.3). Then $\psi(x, t)$ solves the free Schrödinger
equation with initial condition $\psi_0$.


The assumption that $\psi_0$ be a Schwartz function is stronger than necessary.
The reader is invited to trace through the argument and find suitable
weaker conditions.

Proof. Since the Fourier transform of a Schwartz function is a Schwartz
function, $\hat{\psi}_0(k)$ will decay faster than $1/k^4$ as $k$ tends to $\pm\infty$. Meanwhile,
by integrating the derivative of the function $e^{ikx}$, we obtain the estimate

\[ \left|\frac{e^{ik(x+h)} - e^{ikx}}{h}\right| \le |k|. \]

We can then apply dominated convergence, using $|k|\,|\hat{\psi}_0(k)|$ as our
dominating function, to move a derivative with respect to $x$ under the integral
sign in the formula for $\psi(x, t)$. This derivative pulls down a factor of $ik$
inside the integral. The decay of $\hat{\psi}_0$ allows us to repeat this argument to
move a second derivative with respect to $x$ inside the integral. We can also move
a derivative with respect to $t$ inside the integral, by a similar argument.
Since $\exp\{i(kx - \omega(k)t)\}$ satisfies the Schrödinger equation for each
fixed $k$, differentiation under the integral shows that $\psi(x, t)$ satisfies the
Schrödinger equation as well. The Fourier inversion formula shows that
$\psi(x, 0) = \psi_0(x)$. ∎

Proposition 4.3 If $\psi(x, t)$ is as in Proposition 4.2, then the Fourier
transform of $\psi(x, t)$, with respect to $x$ with $t$ fixed, is given by

\[ \hat{\psi}(k, t) = \hat{\psi}_0(k)\exp\left[-i\frac{\hbar k^2 t}{2m}\right]. \tag{4.6} \]


Proof. We can write (4.5) as

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \left[\hat{\psi}_0(k)\,e^{-i\omega(k)t}\right] e^{ikx}\,dk. \]

By the uniqueness of the Fourier decomposition (i.e., the injectivity of the
inverse Fourier transform, which follows from the Plancherel formula), the
Fourier transform of $\psi(x, t)$ (with respect to $x$) must be the function in
square brackets. Putting in the expression (4.3) for $\omega(k)$ establishes the
desired result. ∎

Now, the Fourier transform is a unitary map from $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$.
Thus, for any $\psi_0$ in $L^2(\mathbb{R})$, $\hat{\psi}_0$ also belongs to $L^2(\mathbb{R})$. Since the quantity
multiplying $\hat{\psi}_0(k)$ in (4.6) has absolute value 1, the right-hand side of (4.6)
is a well-defined square-integrable function of $k$, for any $\psi_0$ in $L^2(\mathbb{R})$, which
has a well-defined inverse Fourier transform in $L^2(\mathbb{R})$.


Definition 4.4 For any $\psi_0 \in L^2(\mathbb{R})$, define, for each $t \in \mathbb{R}$, $\psi(x, t)$ to be
the unique element of $L^2(\mathbb{R})$ that has a Fourier transform (with respect to
$x$) given by (4.6).


Definition 4.4 defines a time-evolution for arbitrary initial conditions
in $L^2(\mathbb{R})$. For general $\psi_0 \in L^2(\mathbb{R})$, however, $\psi(x, t)$ may not satisfy the
Schrödinger equation in the classical, pointwise sense, simply because $\psi(x, t)$
may fail to be differentiable, either in $x$ or in $t$. Nevertheless, $\psi(x, t)$, as
defined by Definition 4.4, always satisfies the Schrödinger equation in the
weak (distributional) sense. See Exercise 1.
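
The time-evolution of Definition 4.4 can be imitated numerically by multiplying a discrete Fourier transform by the phase factor in (4.6). The sketch below is a minimal illustration under stated assumptions: the function name `evolve_free`, the periodic grid on $[-40, 40)$, the units $\hbar = m = 1$, and the Gaussian initial state are all hypothetical choices, and the discrete FFT only approximates the Fourier transform on $\mathbb{R}$ when the wave function stays well inside the grid.

```python
import numpy as np

def evolve_free(psi0, x, t, hbar=1.0, m=1.0):
    """Multiply the discrete Fourier transform of psi0 by exp(-i*hbar*k^2*t/(2m)),
    the factor appearing in (4.6), and transform back (periodic grid assumed)."""
    dx = x[1] - x[0]
    k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)  # angular spatial frequencies
    phase = np.exp(-1j * hbar * k**2 * t / (2 * m))
    return np.fft.ifft(phase * np.fft.fft(psi0))

# Hypothetical initial condition: a Gaussian envelope times exp(5ix).
x = np.linspace(-40.0, 40.0, 2048, endpoint=False)
dx = x[1] - x[0]
psi0 = np.exp(-x**2 / 4) * np.exp(5j * x)
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)  # normalize in the discrete L^2 norm

psi_t = evolve_free(psi0, x, t=2.0)
norm_t = np.sum(np.abs(psi_t)**2) * dx   # stays equal to 1: the multiplier has modulus 1
back = evolve_free(psi_t, x, t=-2.0)     # evolving backward recovers the initial state
```

Because the multiplier has absolute value 1, the discrete norm is preserved, mirroring the unitarity argument given before Definition 4.4.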


4.2 Solution as a Convolution 


According to Proposition 4.3, we see that the Fourier transform of the
time-$t$ wave function is the product of the Fourier transform of $\psi_0$ and
the function $\exp[-it\hbar k^2/(2m)]$. According to Proposition A.21, the inverse
Fourier transform of a product of two sufficiently nice functions is $1/\sqrt{2\pi}$
times the convolution of the two separate inverse Fourier transforms. Here
the convolution $\phi * \psi$ of two functions $\phi$ and $\psi$ is defined to be

\[ (\phi * \psi)(x) = \int_{-\infty}^{\infty} \phi(x - y)\,\psi(y)\,dy, \]

whenever the integral is convergent for all $x$.
Formally, then, we ought to have

\[ \psi(x, t) = \psi_0 * K_t, \tag{4.7} \]

where

\[ K_t := \frac{1}{\sqrt{2\pi}}\,\mathcal{F}^{-1}\left\{\exp\left[-i\frac{\hbar k^2 t}{2m}\right]\right\}. \]

The problem with this idea is that the function $\exp[-it\hbar k^2/(2m)]$ is not
a “nice” function in the usual sense. Certainly, this function is not the
Fourier transform of some function in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, because if it were,
then the function would have to tend to zero at infinity (Proposition A.14).
Therefore, we cannot directly apply Proposition A.21, even if $\psi_0$ is in
$L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

Fortunately, the desired inverse Fourier transform can be computed as a 
convergent improper integral (Exercise 2), with the following result: 


\[ K_t(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\exp\left[-i\frac{\hbar k^2 t}{2m}\right]dk = \sqrt{\frac{m}{2\pi i\hbar t}}\,\exp\left\{i\frac{m x^2}{2t\hbar}\right\}. \tag{4.8} \]

Here, the square root is the one with positive real part. The function $K_t$
is called the fundamental solution of the free Schrödinger equation. (See
Fig. 4.1.) This function does indeed satisfy the free Schrödinger equation,
as we can easily verify by direct differentiation.
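
The direct differentiation mentioned above can be sketched as follows. The abbreviation $c := \sqrt{m/(2\pi i\hbar)}$ is introduced here only for convenience, so that $K_t(x) = c\,t^{-1/2}\exp\{imx^2/(2\hbar t)\}$.

```latex
\begin{align*}
\frac{\partial K_t}{\partial t}
  &= \left(-\frac{1}{2t} - \frac{i m x^2}{2\hbar t^2}\right) K_t, \\
\frac{\partial K_t}{\partial x}
  &= \frac{i m x}{\hbar t}\, K_t, \qquad
\frac{\partial^2 K_t}{\partial x^2}
   = \left(\frac{i m}{\hbar t} - \frac{m^2 x^2}{\hbar^2 t^2}\right) K_t, \\
\frac{i\hbar}{2m}\frac{\partial^2 K_t}{\partial x^2}
  &= \left(-\frac{1}{2t} - \frac{i m x^2}{2\hbar t^2}\right) K_t
   = \frac{\partial K_t}{\partial t}.
\end{align*}
```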

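The preceding discussion should make the following result plausible.


Theorem 4.5 Suppose $\psi_0 \in L^2(\mathbb{R}) \cap L^1(\mathbb{R})$. Then $\psi(x, t)$, as defined by
(4.5), may be computed for all $t \neq 0$ as

\[ \psi(x, t) = \sqrt{\frac{m}{2\pi i\hbar t}}\int_{-\infty}^{\infty} \exp\left\{i\frac{m}{2t\hbar}(x - y)^2\right\}\psi_0(y)\,dy. \]

The expression for $\psi(x, t)$ is $K_t * \psi_0$, where $K_t$ is as in (4.8).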


Proof. For any set $E \subset \mathbb{R}$, let $1_E$ denote the indicator function of $E$, that
is, the function that is 1 on $E$ and 0 elsewhere. Then $K_t 1_{[-n,n]}$ belongs to
$L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ for any positive integer $n$. By Proposition A.21, then, we
have

\[ \mathcal{F}\left((K_t 1_{[-n,n]}) * \psi_0\right) = \sqrt{2\pi}\,\mathcal{F}(K_t 1_{[-n,n]})\,\mathcal{F}(\psi_0). \tag{4.9} \]

Because $\psi_0$ is in $L^1(\mathbb{R})$, it is easy to see that $K_t 1_{[-n,n]} * \psi_0$ converges
pointwise to $K_t * \psi_0$. On the other hand, using the argument in Exercise 2,
we can see that $\mathcal{F}(K_t 1_{[-n,n]})$ is bounded by a constant independent of $n$
and converges pointwise to the function

\[ \frac{1}{\sqrt{2\pi}}\exp\left[-i\frac{\hbar k^2 t}{2m}\right]. \tag{4.10} \]

FIGURE 4.1. The real part of $K_t(x)$ at two values of $t$ (top and bottom panels; the bottom panel is $t = 0.2$).

Equation (4.10) is enough to show that the right-hand side of (4.9)
converges in $L^2(\mathbb{R})$ to the function

\[ \exp\left[-i\frac{\hbar k^2 t}{2m}\right]\hat{\psi}_0(k). \]

By the Plancherel theorem, $K_t 1_{[-n,n]} * \psi_0$ must also be converging in $L^2(\mathbb{R})$,
and the $L^2$ limit must coincide with the pointwise limit, which is $K_t * \psi_0$.
Thus, taking limits on both sides of (4.9) shows that the Fourier transform
of $K_t * \psi_0$ is what we want it to be. ∎

In general, to be considered the fundamental solution of a certain equation,
a function should converge to a Dirac $\delta$-function (Example A.26), in
the distribution sense, as $t$ tends to zero. Since $|K_t(x)|$ is independent of
$x$ for each $t$, it might seem doubtful that $K_t$ has this property. On the
other hand, we can see that $K_t(x)$ oscillates very rapidly except near $x = 0$.
(See Fig. 4.1.) This oscillation causes the integral of $K_t(x)$ against some
nice function $\phi(x)$ to be small, except for the part of the integral near
$x = 0$. Indeed, because the Fourier transform of $K_t$ converges to the
constant function $1/\sqrt{2\pi}$ (which is what we get by formally taking the Fourier
transform of the $\delta$-function) as $t$ tends to zero, it is not hard to show that
$K_t$ does, in fact, converge to a $\delta$-function. The details of this verification
are left to the reader.


4.3 Propagation of the Wave Packet: First 
Approach 


Let us consider the Schrödinger equation in $\mathbb{R}^1$ with an initial condition
$\psi_0$ that is a “wave packet,” meaning a complex exponential multiplied by
some function that localizes $\psi_0$ in space. Specifically, we take

\[ \psi_0(x) = e^{ip_0 x/\hbar}A_0(x), \tag{4.11} \]

where $A_0$ is some real, positive function and $p_0$ is a nonzero real number.
(The case $p_0 = 0$ should be treated separately.) We also assume that $A_0$ is
“slowly varying” compared to $e^{ip_0 x/\hbar}$, meaning that $A_0$ is approximately
constant over many periods of the function $e^{ip_0 x/\hbar}$. (We will give a more
precise meaning to the “slowly varying” condition shortly.) Thus, if we look
at $\psi_0(x)$ on a distance scale of a small number of periods of the function
$e^{ip_0 x/\hbar}$, then $\psi_0$ will look like a constant times $e^{ip_0 x/\hbar}$, which, as we have
seen, represents a particle with momentum $p_0$. We expect, then, that the
wave function $\psi_0$ represents a particle with momentum approximately equal
to $p_0$.

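Let us now try to solve the free Schrödinger equation in terms of the
amplitude and phase of the wave function. We write

\[ \psi(x, t) = A(x, t)\,e^{i\theta(x, t)}, \]

where $A$ and $\theta$ are real-valued functions. If we plug this expression for $\psi$
into the free Schrödinger equation and then cancel a factor of $e^{i\theta(x,t)}$ from
every term, we obtain the equation

\[ \frac{\partial A}{\partial t} + iA\frac{\partial\theta}{\partial t} = \frac{i\hbar}{2m}\frac{\partial^2 A}{\partial x^2} - \frac{\hbar}{m}\frac{\partial A}{\partial x}\frac{\partial\theta}{\partial x} - \frac{i\hbar}{2m}A\left(\frac{\partial\theta}{\partial x}\right)^2 - \frac{\hbar}{2m}A\frac{\partial^2\theta}{\partial x^2}. \tag{4.12} \]

Since $A$ and $\theta$ are real-valued, we may separately equate the real and
imaginary parts of (4.12), giving

\[ \frac{\partial A}{\partial t} = -\frac{\hbar}{m}\frac{\partial A}{\partial x}\frac{\partial\theta}{\partial x} - \frac{\hbar}{2m}A\frac{\partial^2\theta}{\partial x^2} \tag{4.13} \]

and (after dividing the imaginary part of (4.12) by $A$)

\[ \frac{\partial\theta}{\partial t} = \frac{\hbar}{2m}\frac{1}{A}\frac{\partial^2 A}{\partial x^2} - \frac{\hbar}{2m}\left(\frac{\partial\theta}{\partial x}\right)^2. \tag{4.14} \]

Any solution to this system of partial differential equations will yield a
solution $\psi(x, t) = A(x, t)e^{i\theta(x,t)}$ to the free Schrödinger equation.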
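
For reference, the derivatives that enter when $\psi = Ae^{i\theta}$ is substituted into the free Schrödinger equation can be written out as below; this is a routine product-rule computation, included only as a check on (4.12).

```latex
\begin{align*}
\frac{\partial\psi}{\partial t}
  &= \left(\frac{\partial A}{\partial t} + iA\frac{\partial\theta}{\partial t}\right)e^{i\theta}, \\
\frac{\partial\psi}{\partial x}
  &= \left(\frac{\partial A}{\partial x} + iA\frac{\partial\theta}{\partial x}\right)e^{i\theta}, \\
\frac{\partial^2\psi}{\partial x^2}
  &= \left(\frac{\partial^2 A}{\partial x^2}
     + 2i\frac{\partial A}{\partial x}\frac{\partial\theta}{\partial x}
     + iA\frac{\partial^2\theta}{\partial x^2}
     - A\left(\frac{\partial\theta}{\partial x}\right)^2\right)e^{i\theta}.
\end{align*}
% Multiplying the last line by i\hbar/(2m), setting it equal to the first line,
% and cancelling e^{i\theta} reproduces the four terms on the right of (4.12).
```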

Since we are assuming $A$ is “slowly varying” compared to $\theta$, it is reasonable
to think that the first term on the right-hand side of (4.14) will be
small compared to the second term. That is to say, we interpret the slowly
varying condition to mean

\[ \left|\frac{1}{A}\frac{\partial^2 A}{\partial x^2}\right| \ll \left(\frac{\partial\theta}{\partial x}\right)^2, \tag{4.15} \]

where the symbol $\ll$ means “much smaller than.” We will take initial
conditions such that (4.15) holds at $t = 0$, and then we will assume that (4.15)
continues to hold at least for small positive times. We may then (to first
approximation) drop the first term on the right-hand side of (4.14), giving
the following simplified version of (4.14):

\[ \frac{\partial\theta}{\partial t} = -\frac{\hbar}{2m}\left(\frac{\partial\theta}{\partial x}\right)^2. \tag{4.16} \]

We now look for a solution to the pair of equations (4.13) and (4.16)
with initial conditions corresponding to (4.11).


Proposition 4.6 A solution to the approximate equations (4.13) and
(4.16) with initial condition $\theta(x, 0) = p_0 x/\hbar$ is given by

\[ \theta(x, t) = \frac{p_0}{\hbar}\left(x - \frac{p_0}{2m}t\right), \tag{4.17} \]

\[ A(x, t) = A_0\left(x - \frac{p_0}{m}t\right). \tag{4.18} \]

This yields an approximate solution to the free Schrödinger equation
given by

\[ \psi(x, t) = A_0\left(x - \frac{p_0}{m}t\right)\exp\left\{\frac{ip_0}{\hbar}\left(x - \frac{p_0}{2m}t\right)\right\}. \tag{4.19} \]

Note from (4.17) and (4.18) that if the “slowly varying” condition (4.15)
holds at time 0, it will continue to hold for all positive times in our
approximate solution.
Proof. Although (4.16) is a nonlinear equation, we can find a solution to
it with the simple initial conditions $\theta(x, 0) = p_0 x/\hbar$, namely,

\[ \theta(x, t) = \frac{p_0 x}{\hbar} - \frac{p_0^2}{2m\hbar}t = \frac{p_0}{\hbar}\left(x - \frac{p_0}{2m}t\right). \tag{4.20} \]

Since $\partial\theta/\partial x = p_0/\hbar$ and $\partial^2\theta/\partial x^2 = 0$, if we plug (4.20) back into (4.13)
we obtain

\[ \frac{\partial A}{\partial t} = -\frac{p_0}{m}\frac{\partial A}{\partial x}. \]

The (presumably unique) solution to this linear equation with initial
condition $A(x, 0) = A_0(x)$ is

\[ A(x, t) = A_0\left(x - \frac{p_0}{m}t\right), \tag{4.21} \]

as claimed. ∎

We hope that the solution (4.19) to the system of equations (4.13)
and (4.16) is a close approximation to the solution to the original pair of
equations (4.13) and (4.14), assuming, of course, that $A_0$ is slowly varying
compared to $\theta_0(x) = p_0 x/\hbar$. It is not especially easy to estimate directly
how rapidly solutions to (4.13) and (4.16) diverge from solutions to (4.13)
and (4.14). We will therefore leave an estimate of the error in our
approximation until the next section, where we will obtain the same approximate
solution by a different method.

Note that a function of the form $f(x, t) = \phi(x - vt)$ is moving to the right
with constant velocity $v$. (If $v$ is negative, then, of course, this means the
function is moving to the left.) Observe that both the amplitude $A(x, t)$ and
the phase $\exp\{i\theta(x, t)\}$ are of this form, but with two different velocities.


Conclusion 4.7 In the approximate solution (4.19) to the free Schrödinger
equation, the amplitude $A(x, t)$ is moving with velocity $p_0/m$, whereas the
phase $\theta(x, t)$ is moving with velocity $p_0/(2m)$. These two velocities are called
the group velocity and the phase velocity, respectively:

\[ \text{phase velocity} = \frac{p_0}{2m}, \qquad \text{group velocity} = \frac{p_0}{m}. \]

Note that the formula for the phase velocity agrees with the one given
previously in Sect. 4.1, the velocity of propagation of a pure exponential
solution to the free Schrödinger equation. Indeed, nothing prevents us from
taking $A_0 \equiv 1$, in which case the left-hand side of (4.15) is actually
identically zero, so that a solution to (4.13) and (4.16) is actually a solution to
(4.13) and (4.14).

Which of the velocities is the “real” velocity of the particle? The answer
is: the group velocity. After all, the probability distribution for the
particle’s position is determined by the amplitude of the wave function and is
unaffected by the phase. It is the amplitude that determines (as much as it
can be determined) where the particle is. Thus, the true velocity of the
particle should be the velocity at which the amplitude propagates. Figure 4.2
shows the propagation of the real part of a wave packet, with the motion
of a single peak indicated by the shaded region. The phase velocity
determines the speed at which the individual peaks in the real part of $\psi$ move,
whereas the group velocity determines the speed of the packet as a whole.
Since the peak we are tracking lags well behind the motion of the whole
packet, we see that the phase velocity is smaller than the group velocity.
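
The claim that the packet as a whole moves at the group velocity can be checked numerically. The sketch below evolves a Gaussian wave packet with the Fourier multiplier from (4.6) on a periodic grid and measures how fast the mean position moves. The grid, the units $\hbar = m = 1$, and the values $p_0 = 5$, $\sigma = 1$ are illustrative assumptions, not values from the text.

```python
import numpy as np

hbar, m, p0, sigma = 1.0, 1.0, 5.0, 1.0  # assumption: illustrative values

x = np.linspace(-40.0, 40.0, 2048, endpoint=False)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)

# Wave packet of the form (4.11): exp(i p0 x / hbar) times a Gaussian amplitude.
psi0 = np.exp(-x**2 / (4 * sigma**2)) * np.exp(1j * p0 * x / hbar)
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)

def mean_position(t):
    """Expected position at time t, using the multiplier exp(-i*hbar*k^2*t/(2m))."""
    psi_t = np.fft.ifft(np.exp(-1j * hbar * k**2 * t / (2 * m)) * np.fft.fft(psi0))
    return np.sum(x * np.abs(psi_t)**2) * dx

t = 2.0
packet_velocity = (mean_position(t) - mean_position(0.0)) / t
group_velocity = p0 / m          # velocity of the amplitude in Conclusion 4.7
phase_velocity = p0 / (2 * m)    # velocity of the phase in Conclusion 4.7
print(packet_velocity, group_velocity, phase_velocity)
```

In this sketch the measured velocity of the mean position agrees with $p_0/m$, twice the phase velocity.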

We should expect that solutions to our approximate equations (4.13)
and (4.16) will diverge slowly over time from solutions to the free
Schrödinger equation (4.13) and (4.14). For sufficiently long times, there
may be a significant difference between approximate and true solutions.
This expectation is confirmed in Sect. 4.5, where we investigate the spread
of the wave packet, a phenomenon that is not seen in our approximation.


4.4 Propagation of the Wave Packet: Second 
Approach 


We have seen that the general solution of the free Schrödinger equation can
be obtained by means of the Fourier transform as

\[ \psi(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \hat{\psi}_0(k)\exp\left[i(kx - \omega(k)t)\right]dk, \tag{4.22} \]

where

\[ \omega(k) = \frac{\hbar k^2}{2m}. \tag{4.23} \]

Let us assume that $\psi_0$ has approximate momentum equal to $p_0$. Thus, we
expect that $\hat{\psi}_0(k)$ will be concentrated near $k_0 := p_0/\hbar$. If that is the case,
then only the values of $k$ close to $k_0$ are important. For $k$ close to $k_0$, we
use the first-order Taylor expansion

\[ \omega(k) \approx \omega(k_0) + \omega'(k_0)(k - k_0), \tag{4.24} \]

where for now we do not put in the explicit formula for $\omega'(k_0)$.

FIGURE 4.2. Propagation of $\mathrm{Re}[\psi]$, with motion of a single peak shaded.

Inserting (4.24) into (4.22), we get two factors that are independent of $k$
and come outside the integral, leaving us with

\[ \psi(x, t) \approx \frac{1}{\sqrt{2\pi}}\,e^{i\omega'(k_0)k_0 t}\,e^{-i\omega(k_0)t}\int_{-\infty}^{\infty} \hat{\psi}_0(k)\exp\left[ik(x - \omega'(k_0)t)\right]dk \]
\[ = e^{i\omega'(k_0)k_0 t}\,e^{-i\omega(k_0)t}\,\psi_0(x - \omega'(k_0)t). \tag{4.25} \]


Note that the factors in front of $\psi_0(x - \omega'(k_0)t)$ are simply constants,
that is, independent of $x$. These constants do not affect the “state” of the
system, in that we have said that two vectors in the quantum Hilbert space
that differ by a constant represent the same physical state. Ignoring these
constants, we are left with the factor of $\psi_0(x - \omega'(k_0)t)$, which is simply
shifting to the right at speed $\omega'(k_0)$. Thus, the (approximate) velocity at
which our wave packet is moving is

\[ \text{velocity} \approx \omega'(k_0) = \frac{\hbar k_0}{m} = \frac{p_0}{m}. \]


Let us consider the special case in which $\psi_0$ is of the form

\[ \psi_0(x) = e^{ik_0 x}A_0(x), \]

where $A_0$ is real and positive. Then (4.25) becomes

\[ e^{i\omega'(k_0)k_0 t}\,e^{-i\omega(k_0)t}\,e^{ik_0(x - \omega'(k_0)t)}A_0(x - \omega'(k_0)t). \]

After canceling the terms involving $\omega'(k_0)k_0 t$ in the exponent, we obtain

\[ \psi(x, t) \approx e^{i(k_0 x - \omega(k_0)t)}A_0(x - \omega'(k_0)t). \]

Recalling that $p_0 = \hbar k_0$ and putting in the formula for $\omega$, we see that this
approximation to $\psi(x, t)$ is precisely the same as the one we obtained, by
a different method, in Proposition 4.6.

As in Sect. 4.3, we see that the velocity at which a pure exponential
solution of the free Schrödinger equation propagates [namely, $\omega(k_0)/k_0 =
\hbar k_0/(2m)$] is not the same as the velocity at which the overall wave packet
propagates. Rather, as seen in (4.25), the wave packet propagates at a
velocity given by $\omega'(k_0) = \hbar k_0/m$. We may summarize this conclusion in
the following proposition.


Proposition 4.8 The speed at which a pure exponential solution of the
free Schrödinger equation propagates is

\[ \text{phase velocity} = \frac{\omega(k_0)}{k_0} = \frac{\hbar k_0}{2m} = \frac{p_0}{2m}. \]

By contrast, the (approximate) speed at which the wave packet propagates is

\[ \text{group velocity} = \left.\frac{d\omega}{dk}\right|_{k=k_0} = \frac{\hbar k_0}{m} = \frac{p_0}{m}. \]

The disadvantage of the method we used in Sect. 4.3 is that it does not
easily yield estimates on how big an error there is in our approximation.
In the current section, however, we can estimate the error by comparing
the Fourier transforms of the exact solution and the approximate solution.
Our error estimate will involve a quantity $\kappa$ defined as follows:

\[ \kappa = \left[\int_{-\infty}^{\infty} |\hat{\psi}_0(k)|^2\,(k - k_0)^4\,dk\right]^{1/4}. \tag{4.26} \]

The quantity $\kappa$ is, roughly, half the width of the interval around $k_0$ on
which most of $\hat{\psi}_0(k)$ is concentrated. If, for example, $\hat{\psi}_0$ is supported in the
interval $[k_0 - \varepsilon, k_0 + \varepsilon]$, then $\kappa \le \varepsilon$, assuming that $\psi_0$ (and therefore $\hat{\psi}_0$) is
a unit vector. (A more common measure of concentration would replace
$(k - k_0)^4$ by $(k - k_0)^2$ and the fourth root of the integral by the square
root. But the “quartic” measure of concentration in (4.26) is the one that
arises in estimating the error of our approximations in this section.)


Proposition 4.9 Let $\psi(x, t)$ be the exact solution to the free Schrödinger
equation with initial condition $\psi_0$, and let $\phi(x, t)$ be the approximate
solution given by the right-hand side of (4.25). Then the following $L^2$ estimate
holds:

\[ \|\psi(x, t) - \phi(x, t)\|_{L^2(\mathbb{R})} \le \frac{|t|\,\hbar\kappa^2}{2m} = |t|\,\omega(\kappa), \tag{4.27} \]

where the $L^2$ norm is with respect to $x$ with $t$ fixed and where $\omega(\cdot)$ is defined
by (4.23).


Equation (4.27) means that the $L^2$ norm of the error will be small,
provided that

\[ |t| \ll \frac{1}{\omega(\kappa)}. \]

If $\kappa$ is much smaller than $k_0$, then $1/\omega(\kappa)$ will be much larger than $1/\omega(k_0)$.
That means that the timescale on which the true and approximate solutions
diverge will be long compared to the timescale on which our approximate
solution is oscillating.
Proof. Let $\hat{\psi}(k, t)$ and $\hat{\phi}(k, t)$ denote the Fourier transforms of $\psi$ and $\phi$
with respect to $x$, with $t$ fixed. From (4.22) we can read off that

\[ \hat{\psi}(k, t) = e^{-i\omega(k)t}\,\hat{\psi}_0(k). \]

Meanwhile, $\hat{\phi}(k, t)$ is obtained from $\hat{\psi}(k, t)$ by replacing $\omega(k)$ by the
right-hand side of (4.24). Now, direct calculation shows that

\[ \omega(k) - \left(\omega(k_0) + \omega'(k_0)(k - k_0)\right) = \frac{\hbar}{2m}(k - k_0)^2. \]

From this expression and the elementary estimate $|e^{i\theta} - e^{i\phi}| \le |\theta - \phi|$,
we obtain

\[ |\hat{\psi}(k, t) - \hat{\phi}(k, t)| \le \frac{|t|\,\hbar}{2m}(k - k_0)^2\,|\hat{\psi}_0(k)|. \tag{4.28} \]

The estimate (4.27) then follows by the Plancherel theorem and the
definition of $\kappa$. ∎
For a more detailed version of the approach used in this section, see
Sect. 5.6 of [30].
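
The estimate (4.27) can be illustrated numerically by working entirely on the Fourier side, as the proof does. The sketch below assumes a Gaussian $\hat{\psi}_0$ centered at $k_0$, units $\hbar = m = 1$, and the hypothetical values $k_0 = 5$, $s = 0.2$, $t = 3$; integrals are approximated by Riemann sums on a fine $k$-grid.

```python
import numpy as np

hbar, m, k0, s, t = 1.0, 1.0, 5.0, 0.2, 3.0  # assumption: illustrative values

k = np.linspace(k0 - 4.0, k0 + 4.0, 8001)
dk = k[1] - k[0]

# Assumed initial state on the Fourier side: |psi0_hat|^2 is a Gaussian
# probability density centered at k0 with standard deviation s.
psi0_hat = (2 * np.pi * s**2) ** -0.25 * np.exp(-(k - k0)**2 / (4 * s**2))

omega = hbar * k**2 / (2 * m)                                         # (4.23)
omega_lin = hbar * k0**2 / (2 * m) + (hbar * k0 / m) * (k - k0)       # (4.24)

# kappa as in (4.26).
kappa = (np.sum(np.abs(psi0_hat)**2 * (k - k0)**4) * dk) ** 0.25

# L^2 distance between exact and approximate solutions, computed in k-space
# (equal to the distance in x-space by the Plancherel theorem).
diff = (np.exp(-1j * omega * t) - np.exp(-1j * omega_lin * t)) * psi0_hat
error_norm = np.sqrt(np.sum(np.abs(diff)**2) * dk)
bound = abs(t) * hbar * kappa**2 / (2 * m)                            # (4.27)
print(error_norm, bound)
```

For this Gaussian, $\kappa = 3^{1/4}s$, and the computed error stays below the bound $|t|\,\omega(\kappa)$.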


4.5 Spread of the Wave Packet 


We use the uncertainty (Definition 3.13) $\Delta_\psi X$ in the position of the particle
as a measure of the “width” of $\psi(x)$ as a function of $x$. At the level of
approximation considered in the previous two sections, the uncertainty in
the position of a free particle is independent of time. After all, in the
approximate solution (4.19), the amplitude of the wave function simply
shifts to the right at a speed equal to the group velocity, without changing
shape. A more precise calculation, however, shows that after sufficiently
long times, the wave packet spreads out in space. (Exercise 7 gives an idea
of the time scale on which this spread takes place.)

We can compute the time-evolution of the uncertainty in the particle’s
position without having to solve the full Schrödinger equation, by using
Proposition 3.14 from Chap. 3. We start by observing that for a free
particle, our Hamiltonian is simply $P^2/(2m)$, which commutes with $P$. It
follows that the expected value and uncertainty for the particle’s momentum
(and, indeed, the entire probability distribution of the momentum) are
independent of time. Meanwhile, to compute the time-dependence of $\langle X\rangle_{\psi(t)}$
and $\langle X^2\rangle_{\psi(t)}$, we use Proposition 3.14 along with the commutation relation
$[X, P] = i\hbar I$ (Proposition 3.8).


Proposition 4.10 For a wave function $\psi(x, t)$ evolving according to the
free Schrödinger equation on $\mathbb{R}^1$, the expectation values for $X$ and $X^2$ evolve
as follows:

\[ \langle X\rangle_{\psi(t)} = \langle X\rangle_{\psi_0} + \frac{t}{m}\langle P\rangle_{\psi_0} \]

and

\[ \langle X^2\rangle_{\psi(t)} = \langle X^2\rangle_{\psi_0} + \frac{t}{m}\langle XP + PX\rangle_{\psi_0} + \frac{t^2}{m^2}\langle P^2\rangle_{\psi_0}. \]

These relations imply the following result:
For a unit vector $\psi_0$ in $L^2(\mathbb{R})$, the uncertainty $\Delta_{\psi_0}P$ in the momentum
cannot be zero, because the uncertainty would be zero only if $\psi_0$ is an
eigenvector for the momentum operator. But the eigenvectors for $P$ are
the functions of the form $e^{ikx}$, which are not in $L^2(\mathbb{R})$. Thus, the leading
coefficient in the expression for $(\Delta_{\psi(t)}X)^2$ is never zero, and thus $\Delta_{\psi(t)}X$
tends to infinity as $t$ tends to infinity.
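
Subtracting the square of the first formula in Proposition 4.10 from the second gives, as a sketch of the expression referred to above,

\[ (\Delta_{\psi(t)}X)^2 = (\Delta_{\psi_0}X)^2 + \frac{t}{m}\left(\langle XP + PX\rangle_{\psi_0} - 2\langle X\rangle_{\psi_0}\langle P\rangle_{\psi_0}\right) + \frac{t^2}{m^2}(\Delta_{\psi_0}P)^2. \]

The code below checks this quadratic growth numerically for one assumed example: a Gaussian packet $e^{ip_0x/\hbar}e^{-x^2/(4\sigma^2)}$, for which $\Delta_{\psi_0}X = \sigma$, $\Delta_{\psi_0}P = \hbar/(2\sigma)$, and the linear term vanishes. The grid, the units $\hbar = m = 1$, and the values of $\sigma$, $p_0$, and $t$ are illustrative choices.

```python
import numpy as np

hbar, m, sigma, p0 = 1.0, 1.0, 1.0, 2.0  # assumption: illustrative values

x = np.linspace(-60.0, 60.0, 4096, endpoint=False)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)

# Gaussian packet: position uncertainty sigma, momentum uncertainty hbar/(2*sigma).
psi0 = np.exp(-x**2 / (4 * sigma**2)) * np.exp(1j * p0 * x / hbar)
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)

def position_variance(t):
    """Variance of position at time t under the free evolution (periodic grid)."""
    psi_t = np.fft.ifft(np.exp(-1j * hbar * k**2 * t / (2 * m)) * np.fft.fft(psi0))
    prob = np.abs(psi_t)**2 * dx
    mean = np.sum(x * prob)
    return np.sum((x - mean)**2 * prob)

t = 4.0
measured = position_variance(t)
# Quadratic formula with vanishing linear term for this particular packet.
predicted = sigma**2 + (hbar * t / (2 * m * sigma))**2
print(measured, predicted)
```

The variance grows like $t^2(\Delta_{\psi_0}P)^2/m^2$ for large $t$, which is the spreading discussed in this section.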

Proof. We compute that

\[ [P^2, X] = P^2X - PXP + PXP - XP^2 = P[P, X] + [P, X]P = -2i\hbar P. \]

Thus (as we have already noted in Sect. 3.7.5),

\[ \frac{d}{dt}\langle X\rangle_{\psi(t)} = \left\langle\frac{i}{2m\hbar}(-2i\hbar P)\right\rangle_{\psi(t)} = \frac{\langle P\rangle_{\psi(t)}}{m} = \frac{\langle P\rangle_{\psi_0}}{m}, \tag{4.29} \]

where we have used in the last equality that the expected momentum is
independent of time. Since the derivative of $\langle X\rangle_{\psi(t)}$ is constant, $\langle X\rangle_{\psi(t)}$
itself is a linear function of $t$, which gives the first result in the proposition.
Meanwhile, a little algebra shows that

\[ [P^2, X^2] = P[P, X]X + [P, X]PX + XP[P, X] + X[P, X]P = -2i\hbar(PX + XP), \]

and

\[ [P^2, PX + XP] = P[P^2, X] + [P^2, X]P = -4i\hbar P^2. \]

Thus

\[ \frac{d}{dt}\langle X^2\rangle_{\psi(t)} = \left\langle\frac{i}{2m\hbar}[P^2, X^2]\right\rangle_{\psi(t)} = \frac{1}{m}\langle XP + PX\rangle_{\psi(t)} \]

and

\[ \frac{d^2}{dt^2}\langle X^2\rangle_{\psi(t)} = \frac{1}{m}\left\langle\frac{i}{2m\hbar}[P^2, XP + PX]\right\rangle_{\psi(t)} = \frac{2}{m^2}\langle P^2\rangle_{\psi(t)} = \frac{2}{m^2}\langle P^2\rangle_{\psi_0}. \]

Since the second derivative of $\langle X^2\rangle_{\psi(t)}$ is independent of $t$, $\langle X^2\rangle_{\psi(t)}$ itself
is a quadratic polynomial in $t$, the coefficients of which are determined by
the value of $\langle X^2\rangle_{\psi(t)}$ and its first two time-derivatives at $t = 0$. This leads
to the second result in the proposition. The last result follows by direct
calculation. ∎


4.6 Exercises


1. A locally integrable function $\psi(x, t)$ satisfies the free Schrödinger
equation in the weak (or distributional) sense if for each smooth
compactly supported function $\chi$, we have

\[ \int_{\mathbb{R}^2} \psi(x, t)\left[\frac{\partial\chi}{\partial t} + \frac{i\hbar}{2m}\frac{\partial^2\chi}{\partial x^2}\right]dx\,dt = 0. \tag{4.30} \]

[One obtains (4.30) by assuming $\partial\psi/\partial t - (i\hbar/2m)\,\partial^2\psi/\partial x^2$ is zero,
integrating against $\chi(x, t)$, and then formally integrating by parts.]


(a) Show that if $\psi(x, t)$ is smooth as a function of $x$ and $t$ then $\psi$
satisfies the free Schrödinger equation in the pointwise sense if
and only if $\psi$ satisfies the free Schrödinger equation in the weak
sense.

Hint: Proposition A.23 may be useful.


(b) For any $\psi_0 \in L^2(\mathbb{R})$, define $\psi(x, t)$ by Definition 4.4. Show that
$\psi$ satisfies the free Schrödinger equation in the weak sense.


First show that the function $\psi_A$ given by

\[ \psi_A(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-A}^{A} \hat{\psi}_0(k)\,e^{i(kx - \omega(k)t)}\,dk \]

satisfies the free Schrödinger equation in the weak sense, for each $A$.


2. (a) Show that for any $a \in \mathbb{C}$ with $\mathrm{Re}(a) > 0$,

\[ \left(\int_{-\infty}^{\infty} e^{-x^2/(2a)}\,dx\right)^2 = \int_{\mathbb{R}^2} e^{-(x^2 + y^2)/(2a)}\,dx\,dy = 2\pi a, \]

where the integral over $\mathbb{R}^2$ can be evaluated using polar coordinates.
Conclude that

\[ \int_{-\infty}^{\infty} e^{-x^2/(2a)}\,dx = \sqrt{2\pi a}, \tag{4.31} \]

where the square root is the one with positive real part.

(b) Show that for all $A, B > 0$ we have

\[ \int_A^B e^{-x^2/(2a)}\,dx = \left.-\frac{a}{x}\,e^{-x^2/(2a)}\right|_A^B - \int_A^B \frac{a}{x^2}\,e^{-x^2/(2a)}\,dx \]

for any nonzero complex number $a$. Using this, show that the
integral in (4.31) is convergent for all nonzero $a$ with $\mathrm{Re}\,a \ge 0$,
provided the integral is interpreted as an improper integral (i.e.,
the limit as $A$ tends to infinity of an integral from $-A$ to $A$).

(c) Now show that the result of Part (a) is valid also for nonzero
values of $a$ with $\mathrm{Re}\,a = 0$.
Hint: Given $\beta \neq 0$, show that the (improper) integral from $A$
to $\infty$ of $\exp[-x^2/(2(\alpha + i\beta))]$ is small for large $A$, uniformly in
$\alpha \in (0, 1]$.

(d) Show that

\[ \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,e^{-i\hbar k^2 t/(2m)}\,dk = \sqrt{\frac{m}{2\pi i\hbar t}}\;e^{imx^2/(2t\hbar)}, \]

where the integral is interpreted as an improper integral and the
square root is the one with positive real part.


3. Suppose $\phi$ is a Schwartz function (Definition A.15) and $\psi$ belongs to
$L^2(\mathbb{R})$. Show that the convolution $\phi * \psi$ is smooth (infinitely
differentiable).


4. Consider the heat equation for a function $\psi(x,t)$, given by

$$\frac{\partial\psi}{\partial t} = \alpha\,\frac{\partial^2\psi}{\partial x^2},$$

where $\alpha$ is a constant, subject to the initial condition $\psi(x,0) = \psi_0(x)$.

(a) Derive a differential equation for $\hat{\psi}(k,t)$, the Fourier transform
of a solution of the heat equation with respect to $x$, with $t$ fixed,
assuming that $\psi(x,t)$ is a “nice” function of $x$ for each $t$. Solve
this equation subject to the initial condition $\hat{\psi}(k,0) = \hat{\psi}_0(k)$.

(b) Obtain an expression for the solution to the heat equation as
a convolution of $\psi_0$ with a “fundamental solution” to the heat
equation.

Note: As we will discuss in Chap. 20, the heat equation can be thought
of as a sort of “imaginary time” version of the free Schrödinger
equation.


5. Suppose we take an initial condition in the free Schrödinger equation
with initial phase given by $\theta_0(x) = p_0 x/\hbar$ and initial amplitude given
by $A_0(x)$, as in (4.11). Suppose also that the initial amplitude is of
the form

$$A_0(x) = \exp\left\{-\frac{1}{2}\left(\frac{x - x_0}{L}\right)^2\right\}.$$

Note that $A_0$ is centered around the point $x_0$ and that the parameter
$L$ is a measure of the “width” in space of our initial wave packet.
A function of the form $\psi_0(x) = e^{ip_0 x/\hbar}A_0(x)$, with $A_0$ as above, is
called a Gaussian wave packet.

Compute the quantity

$$\frac{1}{A_0(x)}\,\frac{1}{(p_0/\hbar)^2}\,\frac{\partial^2 A_0}{\partial x^2}. \tag{4.32}$$

Assuming that $\hbar$ is small compared to $Lp_0$, show that (4.32) is small,
except at points where our initial wave packet is very small.

Note: This shows that our “slowly varying” assumption (4.15) is
reasonable for the case of Gaussian wave packets.


6. The Klein-Gordon equation, a proposed relativistic alternative to the
Schrödinger equation, is the equation

$$\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} - \frac{\partial^2\psi}{\partial x^2} = -\frac{m^2c^2}{\hbar^2}\,\psi,$$

where $m > 0$ is the mass of the particle and $c$ is the speed of light.

(a) Obtain the dispersion relation for the Klein-Gordon equation,
that is, the expression for $\omega(k)$ that makes the function
$\exp[i(kx - \omega(k)t)]$ a solution to the Klein-Gordon equation.

(b) Show that the phase velocity $\omega(k)/k$ satisfies $|\omega(k)/k| > c$, that
the group velocity $d\omega(k)/dk$ satisfies $|d\omega/dk| < c$, and that

$$(\text{phase velocity})(\text{group velocity}) = c^2.$$

Note: Since the Klein-Gordon equation is second order in time, there
will be two possible values for $\omega(k)$ for each $k$, one positive and one
negative. The results of Part (b) hold for both of the two “branches”
of $\omega(k)$.


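7. Consider the uncertainty $\Delta_{\psi(t)}X$ of a wave function $\psi(t)$ evolving
according to the free Schrödinger equation. Show that

$$\left|\frac{d}{dt}\left(\Delta_{\psi(t)}X\right)\right| \le \frac{\Delta_{\psi_0}P}{m} \tag{4.33}$$

for all $t$ and that

$$\lim_{t\to+\infty}\frac{d}{dt}\left(\Delta_{\psi(t)}X\right) = \frac{\Delta_{\psi_0}P}{m}.$$

Note: By comparison,

$$\frac{d}{dt}\langle X\rangle_{\psi(t)} = \frac{\langle P\rangle_{\psi_0}}{m}. \tag{4.34}$$

If $\hat{\psi}_0(k)$ is concentrated in a sufficiently small region around a nonzero
number $k_0 = p_0/\hbar$, then $\Delta_{\psi_0}P$ will be small compared to $\langle P\rangle_{\psi_0}$. In
that case, by comparing (4.33) to (4.34), we see that the rate at which
the wave packet spreads out is small compared to the rate at which
the wave packet moves.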
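As a numerical sanity check on formula (4.31) of Exercise 2 (not a
substitute for the proof requested there), one can compare a trapezoid-rule
approximation of the integral with $\sqrt{2\pi a}$ for a sample complex $a$
with positive real part. This is a minimal sketch; the function name
`gaussian_integral`, the truncation interval, the step count, and the sample
value of `a` are all illustrative assumptions.

```python
import cmath


def gaussian_integral(a, half_width=20.0, steps=8000):
    """Trapezoid-rule approximation of the integral of exp(-x^2/(2a))
    over [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-x * x / (2 * a))
    return total * h


a = 1 + 2j  # sample value with Re(a) > 0
numeric = gaussian_integral(a)
exact = cmath.sqrt(2 * cmath.pi * a)  # principal root, which has positive real part
print(numeric, exact)
```

For this `a` the integrand has modulus $e^{-x^2/10}$, so truncating at
$|x| = 20$ discards only a negligible tail.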


5
A Particle in a Square Well


5.1 The Time-Independent Schrödinger Equation


It is difficult to solve the time-dependent Schrödinger equation explicitly,
even in relatively simple cases. (Even for the free Schrödinger equation,
we made do in Chap. 4 with solutions that are either approximate or that
involve an integral that is not explicitly evaluated.) Usually, then, one
analyzes the time-independent Schrödinger equation (the eigenvector equation
for $\hat{H}$) and then attempts to infer something about the time-dependent
problem from the results. There are a number of problems, including the
harmonic oscillator and the hydrogen atom, in which the time-independent
Schrödinger equation can be solved explicitly.

In this section, we will consider a simple but instructive example, which
can be solved by elementary methods. We consider the time-independent
Schrödinger equation in $\mathbb{R}^1$, with a potential of the form

$$V(x) = \begin{cases} -C, & -A \le x \le A \\ 0, & |x| > A \end{cases}, \tag{5.1}$$

where $A$ and $C$ are positive constants. The region $-A \le x \le A$ is the
“square well” for the potential (Fig. 5.1).

Let us think first for a moment about the behavior of a classical particle
in a square well. If we think of $V$ as the limit of a sequence of potentials
that change linearly from $-C$ to $0$ in a small interval around $x = \pm A$,
we may expect the following behavior for a particle in a square well. If the
energy of the particle is negative, then the particle must be in the well.
In that case, it will move with constant speed until it hits the edge of the
well, at which point it will reflect instantaneously off the wall and move
with the same speed in the opposite direction. If the energy of the particle
is positive, it will move always in the same direction, with speed equal to
one constant when it is not in the well and speed equal to a different
constant when it is in the well.

FIGURE 5.1. A square well potential.

In the quantum case, we will be interested mainly in eigenvectors for the
Schrödinger operator with negative eigenvalues ($E < 0$). Of course, on the
quantum side of things, energy eigenvectors do not change in time, except
for an overall phase factor. Nevertheless, since the classical particle with
$E < 0$ spends the same amount of time in each part of the well, we may
expect that the quantum particle will have approximately equal probability
of being found in each part of the well. This expectation will be fulfilled
for “highly excited states,” such as the one in Fig. 5.7. For the quantum
particle, however, there is a small but nonzero probability of finding the
particle outside the well, which is impossible classically.

Our goal is to study the time-independent Schrödinger equation, that is,
the eigenvalue equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi(x) = E\psi(x), \tag{5.2}$$

where both the eigenvalues $E$ and the associated eigenvectors $\psi$ (or
“eigenfunctions,” in physics terminology) are as yet unknown. As a
second-order linear ordinary differential equation, this equation always has
(for any value of $E$) a two-dimensional solution space. We are, however,
looking for solutions that lie in the quantum Hilbert space $L^2(\mathbb{R})$.
We will see there are actually only finitely many $E$'s, all of them with
$E < 0$, for which (5.2) has a nonzero solution in $L^2(\mathbb{R})$. In this
case, then, the Schrödinger operator $\hat{H}$ has a discrete spectrum below
zero and a continuous spectrum above zero.

5.2 Domain Questions and the Matching Conditions


Before starting to solve (5.2), we must give some heed to the unbounded
nature of the Hamiltonian operator. The Schrödinger operator

$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)$$

on the left-hand side of (5.2) is an unbounded operator, meaning that there
is no constant $C$ such that $\|\hat{H}\psi\| \le C\|\psi\|$, where $\|\cdot\|$ is the $L^2$ norm. On
the other hand, we want to define $\hat{H}$ in such a way that it is self-adjoint.
But according to Corollary 9.9, a self-adjoint operator that is defined on
the whole Hilbert space must be bounded.

We conclude, then, that $\hat{H}$ is not going to be defined on the entire Hilbert
space $L^2(\mathbb{R})$, but only on a dense subspace thereof. In practical terms,
saying that $\hat{H}$ is not defined on the whole Hilbert space means simply that
for many functions $\psi$ in $L^2(\mathbb{R})$, the second derivative $d^2\psi/dx^2$ does not
exist, or exists but fails to be in $L^2$. (In our example, the potential $V$ is
bounded, and so $V\psi$ will always be in $L^2$ provided that $\psi$ is in $L^2$.)

Since the potential $V$ for a square well is bounded, the domain of the
Hamiltonian $\hat{H} = P^2/(2m) + V(X)$ is the same as the domain of the
kinetic energy operator $P^2/(2m) = -(\hbar^2/2m)\,d^2/dx^2$. As we will see in
Sect. 9.7, the domain of the kinetic energy operator may be described as
the space of $L^2$ functions $\psi$ for which $d^2\psi/dx^2$, computed in the weak
or distributional sense (Appendix A.3.3), again belongs to $L^2(\mathbb{R})$. This
condition is equivalent to the statement that there exists some $L^2$ function
$\phi$ such that $\psi$ is the second integral of $\phi$ (for some choice of the constants
of integration).

Meanwhile, since our potential is piecewise constant, any solution $\psi$
to (5.2) will be smooth except possibly at the transition points $x = \pm A$,
and both $\psi$ and $\psi'$ will have left and right limits at $A$ and $-A$. Indeed, on
each of the intervals $(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$, any solution to (5.2)
will be simply a linear combination of (real or complex) exponentials. For
functions of this sort, it is not hard to see when we are in the domain of $\hat{H}$.





Proposition 5.1 Suppose $\psi$ is smooth on each of the intervals $(-\infty, -A)$,
$(-A, A)$, and $(A, \infty)$. Then $\psi$ belongs to the domain of $\hat{H}$ [with potential
function given by (5.1)] if and only if (1) $\psi$ and $d\psi/dx$ are continuous
at $x = \pm A$, and (2) $d^2\psi/dx^2$ belongs to $L^2(\mathbb{R})$.





Proof. Suppose first that $\psi$ satisfies the conditions (1) and (2). Then it is
not hard to see (Exercise 1) that the second derivative of $\psi$ in the
distribution sense is simply the function $d^2\psi/dx^2$, computed in the ordinary
pointwise sense for $x \neq \pm A$. (The second derivative may not exist at $x = \pm A$,
but we simply leave $d^2\psi/dx^2$ undefined at these two points, which form a
set of measure zero.) Thus, $d^2\psi/dx^2$, computed in the distribution sense,
is an element of $L^2(\mathbb{R})$.

On the other hand, if either $\psi$ or $\psi'$ has a discontinuity at $x = A$ or at
$x = -A$, then (Exercise 1 again) the distributional derivative will contain
either a multiple of a $\delta$-function or a multiple of the derivative of a
$\delta$-function at one of these points. But neither a $\delta$-function nor the
derivative of a $\delta$-function is a square-integrable function. ∎

Let us think about what the continuity condition on $\psi$ and $d\psi/dx$ means
in practical terms. Since $V$ is constant on $(-\infty, -A)$, we can easily solve
(5.2) on that interval, obtaining a two-dimensional solution space. Once we
choose a solution from this solution space, then the values of $\psi$ and $d\psi/dx$
as $x$ approaches $-A$ from the left will serve as the initial conditions for
solving (5.2) on $(-A, A)$. Thus, the requirement of continuity for $\psi$ and
$d\psi/dx$ serves as a “matching condition” between the solution on $(-\infty, -A)$
and the solution on $(-A, A)$. We cannot just separately pick any solution to
(5.2) on $(-\infty, -A)$ and any solution on $(-A, A)$; at the boundary, the values
of $\psi$ and $d\psi/dx$ must match. (This same matching condition appears in
elementary treatments of ordinary differential equations with discontinuous
coefficients.)

Once we pick a solution on $(-\infty, -A)$, we get a unique solution on
$(-A, A)$, and then the values of $\psi$ and $d\psi/dx$ as we approach $A$ from
the left will serve as the initial conditions for solving (5.2) on $(A, \infty)$. The
conclusion is that once we pick a solution to (5.2) on $(-\infty, -A)$ (from the
two-dimensional solution space), we have no additional choices to make;
the differential equation along with the matching conditions give a unique
way to extend the solution from $(-\infty, -A)$ to the whole real line.


5.3 Finding Square-Integrable Solutions 


If $E > 0$, then any solution to (5.2) will be a combination of two complex
exponentials in the range $x < -A$; such a function cannot be
square-integrable unless it is identically zero. If, however, we take $\psi$ to be
identically zero in the region $x < -A$, then our continuity condition requires
that $\psi$ and $d\psi/dx$ approach 0 as $x$ approaches $-A$ from the right. Thus,
the matching conditions at $-A$ force the solution to be identically zero in
$[-A, A]$ as well. Finally, by matching across $x = A$, we get an identically
zero solution on $[A, \infty)$. Thus, for $E > 0$, any square-integrable solution to
(5.2) satisfying the continuity conditions in Proposition 5.1 must be
identically zero. A similar analysis applies when $E = 0$, where the solutions
to (5.2) on $(-\infty, -A]$ would be of the form $c_1 + c_2 x$, which is
square-integrable only if $c_1 = c_2 = 0$.

The conclusion, then, is that to have a chance to get a nonzero solution to
(5.2) that is square-integrable and in the domain of $\hat{H}$, we must take $E < 0$.
For $E < 0$, the solution to (5.2) on $(-\infty, -A)$ will be a linear combination of
the two exponentials $\exp(\alpha x)$ and $\exp(-\alpha x)$, where

$$\alpha = \frac{\sqrt{2m|E|}}{\hbar}. \tag{5.3}$$

For $\psi$ to be square-integrable over $(-\infty, -A)$, the coefficient of $\exp(-\alpha x)$
must be zero, since this term grows exponentially as $x$ tends to $-\infty$. Thus,
the value of $\psi$ on $(-\infty, -A)$ must be $c\exp(\alpha x)$. Once we choose a value
for $c$, we get a unique solution on $(-A, A)$ by matching $\psi$ and $\psi'$ across
$x = -A$. We then get a unique solution on $(A, \infty)$ by matching across
$x = A$. The solution on $(A, \infty)$ will again be a linear combination
of $\exp(\alpha x)$ and $\exp(-\alpha x)$. For $\psi$ to be in $L^2$, we need the coefficient of
$\exp(\alpha x)$ on $(A, \infty)$ to be zero. We have no choice, however, about what $\psi$
is on $(A, \infty)$; the coefficient of $\exp(\alpha x)$ either comes out to be zero or it
does not.

The conclusion, then, is that for any $E < 0$, there is a unique (up to a
constant) solution to (5.2) that is square-integrable on the interval $(-\infty, -A)$.
This solution then gives rise to a unique solution on $(-A, A)$ and then to a
unique solution on $(A, \infty)$, up to a constant. Unless we are lucky, the
solution on $(A, \infty)$ will grow exponentially and thus fail to be in $L^2$. Therefore,
in most cases there will be no nonzero solution to (5.2) that satisfies the
continuity condition and is square-integrable over the whole real line. The
hope is that for certain special values of $E$, we will be able to find a
solution that decays exponentially both on $(-\infty, -A)$ and on $(A, \infty)$, in which
case the solution will belong to $L^2(\mathbb{R})$.
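The matching procedure just described can be sketched numerically: start
from the decaying solution on the left, carry the values of the solution and
its derivative across the well, and read off the coefficient of the growing
exponential on the right. The code below is only an illustration under
stated assumptions: units with $\hbar^2/(2m) = 1$ (so the decay rate is
$\sqrt{-E}$ and the wave number in the well is $\sqrt{E + C}$), the
normalization $\psi(-A) = 1$, the sample values $C = A = 1$, and the helper
names `growing_coefficient` and `find_energy` are all choices made here, not
taken from the text.

```python
import math


def growing_coefficient(E, C, A):
    """Coefficient of exp(alpha*(x - A)) on (A, oo) for the solution equal to
    exp(alpha*(x + A)) on (-oo, -A), with -C < E < 0.

    Assumption: units with hbar**2/(2m) = 1, so alpha = sqrt(-E) and the
    wave number inside the well is k = sqrt(E + C).
    """
    alpha = math.sqrt(-E)
    k = math.sqrt(E + C)
    # Initial data at x = -A from the left-hand solution: psi = 1, psi' = alpha.
    # Inside the well: psi(x) = cos(k(x + A)) + (alpha/k) sin(k(x + A)).
    psi_A = math.cos(2 * k * A) + (alpha / k) * math.sin(2 * k * A)
    dpsi_A = -k * math.sin(2 * k * A) + alpha * math.cos(2 * k * A)
    # On (A, oo): psi = p*exp(alpha(x - A)) + q*exp(-alpha(x - A)); solve for p.
    return 0.5 * (psi_A + dpsi_A / alpha)


def find_energy(C, A, lo, hi, iterations=200):
    """Bisection for a zero of the growing coefficient on a bracket [lo, hi]
    on which the coefficient changes sign."""
    f_lo = growing_coefficient(lo, C, A)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = growing_coefficient(mid, C, A)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


C, A = 1.0, 1.0
for E in (-0.9, -0.6, -0.3):
    # Generic energies: the growing coefficient is nonzero.
    print(E, growing_coefficient(E, C, A))
E_bound = find_energy(C, A, lo=-0.9, hi=-0.1)
print(E_bound, growing_coefficient(E_bound, C, A))
```

For generic energies the coefficient is nonzero, so the solution grows on
the right; the bisection locates a special energy at which it vanishes.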

It can be shown (Exercise 6) that there are no nonzero square-integrable
solutions with $E \le -C$. Therefore, any square-integrable solutions to (5.2)
that may exist must come from the range $-C < E < 0$. To analyze this
range, let us rewrite the time-independent Schrödinger equation by dividing
through by $-\hbar^2/(2m)$, yielding the equation

$$\frac{d^2\psi}{dx^2} = \begin{cases} \varepsilon\psi, & |x| > A \\ -(c - \varepsilon)\psi, & |x| \le A \end{cases}, \tag{5.4}$$

where

$$\varepsilon = -\frac{2mE}{\hbar^2}, \qquad c = \frac{2mC}{\hbar^2}. \tag{5.5}$$

Note that although $E$ is assumed to be negative, we have normalized $\varepsilon$ to
be positive; the condition $-C < E < 0$ corresponds to $0 < \varepsilon < c$.

Because our potential function $V$ is even, it is easy to see that for any
solution $\psi$ to (5.4), the even and odd parts of $\psi$ are also solutions. We can,
therefore, analyze even solutions and odd solutions separately. We begin
with the even case. For $x < -A$, every solution to (5.4) that is
square-integrable over $(-\infty, -A)$ is of the form

$$\psi(x) = a e^{\sqrt{\varepsilon}\,x}, \quad x < -A. \tag{5.6}$$

Since we assume that $\psi$ is even, we then have

$$\psi(x) = a e^{-\sqrt{\varepsilon}\,x}, \quad x > A. \tag{5.7}$$

Meanwhile, for $-A < x < A$, every even solution is of the form

$$\psi(x) = b\cos\left(\sqrt{c - \varepsilon}\,x\right). \tag{5.8}$$

Proposition 5.2 Let $\psi$ be the function defined in (5.6)-(5.8). Then there
exist nonzero constants $a$ and $b$ so that $\psi$ belongs to the domain of $\hat{H}$ if
and only if the following matching condition holds:

$$\sqrt{\varepsilon} = \sqrt{c - \varepsilon}\,\tan\left(\sqrt{c - \varepsilon}\,A\right). \tag{5.9}$$


Proof. Clearly both $\psi$ and $d^2\psi/dx^2$ belong to $L^2(\mathbb{R})$. Thus, in light of
Proposition 5.1, we need only ensure that $\psi(x)$ and $\psi'(x)$ are continuous
at $x = \pm A$. Since the exponential functions are never zero, we may always
ensure that $\psi$ itself is continuous by taking any value we like for $b$ and then
choosing $a$ appropriately. Once $\psi$ has been made to be continuous, $\psi'$ will
be continuous provided that $\psi'(x)/\psi(x)$ has the same value as we approach
$\pm A$ from inside the well or from the outside. To obtain the condition (5.9),
we compute $\psi'/\psi$ from (5.6) and then from (5.8), evaluate both quantities
at $x = -A$, and then equate the two values of $\psi'/\psi$. Because we have
made our solution an even function, we get the same matching condition
at $x = A$ as at $x = -A$.

Now, in deriving (5.9), we implicitly assumed that $\psi$ is nonzero at
$x = \pm A$. We do not, however, get any nonzero solutions in which $\psi(\pm A) = 0$.
After all, at points where the cosine function in (5.8) is zero, its derivative
is nonzero. But no choice of the constant in front of the exponentials (5.6)
and (5.7) will produce a function that is zero but has a derivative that is
nonzero. ∎

















Proposition 5.3 For all positive values of $c$ and $A$, there exists at least
one $\varepsilon \in (0, c)$ such that (5.9) holds.


Proof. Case 1: $\sqrt{c}A < \pi/2$. In this case, as $\varepsilon$ varies between 0 and $c$,
the left-hand side of (5.9) will vary between 0 and some positive number,
whereas the right-hand side of (5.9) will vary between some positive number
and 0. By the intermediate value theorem, there must exist $\varepsilon \in (0, c)$ for
which (5.9) holds. See Fig. 5.2.

Case 2: $\sqrt{c}A \ge \pi/2$. In this case, there is $\varepsilon_0 \in [0, c)$ for which
$\sqrt{c - \varepsilon_0}\,A = \pi/2$. As $\varepsilon$ decreases from $c$ to $\varepsilon_0$, the right-hand side of (5.9)
will vary from 0 to $+\infty$. Thus, for $\varepsilon$ slightly larger than $\varepsilon_0$, the right-hand
side of (5.9) will be larger than the left-hand side. By the intermediate value
theorem, there must exist $\varepsilon \in (\varepsilon_0, c)$ for which (5.9) holds. See Fig. 5.3 for a
case with $\sqrt{c}A$ slightly larger than $\pi/2$ and Fig. 5.4 for a case with $\sqrt{c}A$
much larger than $\pi/2$. ∎

Note that if $\sqrt{c}A$ is much larger than $\pi/2$, then there will be multiple
solutions of (5.9), as can be seen in Fig. 5.4.
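The intermediate value argument above translates directly into a bisection
search for a solution of (5.9). The following is a minimal sketch that
follows the two cases of the proof; the function names `even_mismatch` and
`largest_even_solution` and the sample parameter values are illustrative
assumptions.

```python
import math


def even_mismatch(eps, c, A):
    """Right-hand side minus left-hand side of the matching condition (5.9)."""
    k = math.sqrt(c - eps)
    return k * math.tan(k * A) - math.sqrt(eps)


def largest_even_solution(c, A, iterations=200):
    """Bisection for the solution of (5.9) in (eps0, c), following the proof of
    Proposition 5.3: eps0 = 0 in Case 1, and sqrt(c - eps0)*A = pi/2 in Case 2."""
    if math.sqrt(c) * A < math.pi / 2:
        lo = 0.0                               # Case 1
    else:
        lo = c - (math.pi / (2 * A)) ** 2      # Case 2
    hi = c
    # On (lo, hi) the mismatch is positive near lo and negative near hi;
    # only interior midpoints are evaluated, never the end points.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if even_mismatch(mid, c, A) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


eps_shallow = largest_even_solution(c=1.0, A=1.0)    # sqrt(c)*A = 1
eps_deep = largest_even_solution(c=900.0, A=1.0)     # sqrt(c)*A = 30
print(eps_shallow, eps_deep)
```

On the interval searched here the mismatch is strictly decreasing in
$\varepsilon$, so the bisection returns the unique solution in that interval,
which is the largest solution of (5.9).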

We have found, then, at least one solution $\psi$ to (5.4) that satisfies the
matching condition and for which both $\psi$ and $\psi''$ decay exponentially at
infinity. Since this $\psi$ belongs to the domain of $\hat{H}$, we have established the
following result.








FIGURE 5.2. Solving the matching condition, Case 1. 


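Proposition 5.4 For any positive values of $A$ and $C$, there exists at least
one value of $E$ in the range $-C < E < 0$ for which (5.2) has a nonzero
solution in the domain of $\hat{H}$, given by the formula

$$\psi(x) = \begin{cases} \cos\left(\sqrt{c - \varepsilon}\,x\right), & -A \le x \le A \\ \cos\left(\sqrt{c - \varepsilon}\,A\right)\exp\left[-\sqrt{\varepsilon}\,(|x| - A)\right], & |x| > A \end{cases}$$

where $c$ and $\varepsilon$ are defined in (5.5) and where $\varepsilon$ satisfies (5.9).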


In Proposition 5.4, we have not normalized $\psi$ to be a unit vector in
$L^2(\mathbb{R})$, but rather have normalized $\psi$ to equal 1 at the origin. In
Figs. 5.5-5.7, we plot our eigenfunction in several different cases. In
Fig. 5.5, we have a “shallow” well, with $\sqrt{c}A = 1$. In that case, we obtain
only one even eigenvector, which is the ground state of the system (i.e., the
eigenvector with the smallest eigenvalue). Next, we consider a “deep” well,
with $\sqrt{c}A = 30$. For this well, the ground state is shown in Fig. 5.6 and an
“excited state” (i.e., an eigenvector with an eigenvalue that is not the
smallest) is shown in Fig. 5.7.

FIGURE 5.3. Solving the matching condition, Case 2a.

FIGURE 5.4. Solving the matching conditions, Case 2b.

FIGURE 5.5. Ground state for a shallow potential well.

FIGURE 5.6. Ground state for a deep potential well.

FIGURE 5.7. Excited state for a deep potential well.

Note that in the shallow well, the ground state extends quite a bit beyond
the interval $[-A, A]$, whereas in the deep well, the ground state goes to zero
very quickly as soon as we move outside the well. On the other hand, the
excited state in Fig. 5.7 extends comparatively far outside the well.

It is straightforward to adapt the preceding analysis to the odd case. The
matching condition (5.9) is replaced by

$$\sqrt{\varepsilon} = -\sqrt{c - \varepsilon}\,\cot\left(\sqrt{c - \varepsilon}\,A\right) \tag{5.10}$$

(Exercise 2) and the formula for the eigenvectors is now

$$\psi(x) = \begin{cases} \sin\left(\sqrt{c - \varepsilon}\,x\right), & -A \le x \le A \\ \pm\sin\left(\sqrt{c - \varepsilon}\,A\right)\exp\left[-\sqrt{\varepsilon}\,(|x| - A)\right], & |x| > A \end{cases}$$

where we take the $+$ sign for $x > A$ and the $-$ sign for $x < -A$.

FIGURE 5.8. Matching condition for odd solutions.

FIGURE 5.9. An odd solution.


If $\sqrt{c}A \le \pi/2$, then the matching condition (5.10) will have no
solutions, since the right-hand side of (5.10) will be negative for all $\varepsilon \in (0, c)$.
For large values of $\sqrt{c}A$, there will be several solutions to (5.10). A typical
matching scenario and an associated eigenfunction are plotted in Figs. 5.8
and 5.9.
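The dependence of the number of odd solutions on $\sqrt{c}A$ can be
explored numerically. The sketch below works in the auxiliary variable
$u = \sqrt{c - \varepsilon}\,A$, in which (5.10) reads
$-u\cot u = \sqrt{R^2 - u^2}$ with $R = \sqrt{c}A$; this change of variable,
the bracketing intervals, and the function name `odd_solutions` are
assumptions introduced here for illustration, not the text's method.

```python
import math


def odd_solutions(c, A, iterations=200):
    """All eps in (0, c) solving the odd matching condition (5.10).

    In the variable u = sqrt(c - eps)*A, (5.10) reads
    -u*cot(u) = sqrt(R**2 - u**2) with R = sqrt(c)*A.  The left side is
    positive only for u in (pi/2 + n*pi, (n + 1)*pi), and there is one
    root in each such interval that starts below R.
    """
    R = math.sqrt(c) * A
    roots = []
    n = 0
    while math.pi / 2 + n * math.pi < R:
        lo = math.pi / 2 + n * math.pi
        hi = min((n + 1) * math.pi, R)
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            h = -mid / math.tan(mid) - math.sqrt(max(R * R - mid * mid, 0.0))
            if h < 0:
                lo = mid
            else:
                hi = mid
        u = 0.5 * (lo + hi)
        roots.append(c - (u / A) ** 2)   # convert back to eps
        n += 1
    return roots


print(odd_solutions(1.0, 1.0))          # sqrt(c)*A = 1: no odd solutions
print(len(odd_solutions(900.0, 1.0)))   # sqrt(c)*A = 30: several odd solutions
```

Each value returned can be substituted back into (5.10) to confirm that the
two sides agree to within rounding error.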


5.4 Tunneling and the Classically Forbidden Region


Let us now briefly compare the classical situation to the quantum one.
Classically, if a particle has energy $E$, then since the kinetic energy $p^2/(2m)$
is always non-negative, the particle simply cannot be located at a point $x$
with $V(x) > E$. Thus, the region $V(x) < E$ may be called the “classically
allowed” region and the region $V(x) > E$ the “classically forbidden” region.
In the case of a square well potential (5.1), if $-C < E < 0$, then the “well”
itself (i.e., the region with $-A < x < A$) is the classically allowed region
and the outside of the well (i.e., the region with $|x| > A$) is the classically
forbidden region.

Quantum mechanically, if $\hat{H}\psi = E\psi$, then the particle has a definite
value for the energy, namely $E$. We see, however, that such a particle has
a nonzero probability of being located in the classically forbidden region.
Note that although the wave function is not zero in the classically forbidden
region, it does decay exponentially with the distance from the classically
allowed region. That is to say, the quantum particle can penetrate some
distance into the classically forbidden region. Note, however, that if $E$ is
much less than zero (i.e., $\varepsilon$ is large), then a state with $\hat{H}\psi = E\psi$ will decay
very rapidly outside the well (like $\exp[-\sqrt{\varepsilon}\,(|x| - A)]$).
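To get a feeling for the size of this effect, one can compute the
probability of finding the particle in the region $|x| > A$ for the even
ground state of Proposition 5.4. The sketch below rests on assumptions made
here rather than in the text: it uses the variable
$u = \sqrt{c - \varepsilon}\,A$ and the equation $u = R\cos u$ on
$(0, \pi/2)$ with $R = \sqrt{c}A$, which follows from squaring (5.9), and
the function names and sample parameters are illustrative.

```python
import math


def ground_state_u(R, iterations=200):
    """Solve u = R*cos(u) on (0, pi/2) by bisection, where u = sqrt(c - eps)*A
    and R = sqrt(c)*A.  (Assumption: this form is obtained by squaring (5.9).)"""
    lo, hi = 0.0, math.pi / 2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - R * math.cos(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def probability_outside(c, A):
    """Probability of |x| > A for the even ground state of Proposition 5.4."""
    u = ground_state_u(math.sqrt(c) * A)
    k = u / A
    root_eps = math.sqrt(c - k * k)
    # Integral of cos(k x)**2 over [-A, A].
    inside = A + math.sin(2 * u) / (2 * k)
    # Integral of cos(k A)**2 * exp(-2 sqrt(eps)(|x| - A)) over |x| > A.
    outside = math.cos(u) ** 2 / root_eps
    return outside / (inside + outside)


p_shallow = probability_outside(c=1.0, A=1.0)    # sqrt(c)*A = 1
p_deep = probability_outside(c=900.0, A=1.0)     # sqrt(c)*A = 30
print(p_shallow, p_deep)
```

For these sample parameters the shallow-well ground state carries a
substantial fraction of its probability outside the well, while for the deep
well the fraction is very small.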

More generally, we can think about the time-dependent Schrödinger
equation for a particle with energy approximately equal to $E$. If we require
that the energy be exactly equal to $E$, then there is no interesting
time-dependence, since the solution to the time-dependent Schrödinger equation
is simply a constant times $\psi_0$. We can, however, think of a particle where
the uncertainty in the energy is nonzero but small. Suppose such a particle
is traveling through a region with $V < E$ and then approaches a region
with $V > E$ (a “potential barrier”). Classically, the particle would just
reflect off of this barrier and go back in the other direction. Quantum
mechanically, though, it is possible for the particle to “tunnel” through the
potential barrier and come out the other side. That is to say, at some later
time, there will be some non-negligible portion of the wave function on the
far side of the barrier.


5.5 Discrete and Continuous Spectrum 


Our analysis of the eigenvector equation (5.2) for $-C < E < 0$ shows that
there are only finitely many values of $E$ in this range for which we get
square-integrable solutions. It is not hard to analyze the case $E \le -C$
with the result that all nonzero solutions grow exponentially in at least
one direction (Exercise 6). Meanwhile, for $E \ge 0$, any solution to (5.2) on
$(-\infty, -A)$ has sinusoidal behavior and is not square-integrable unless it
is identically zero, in which case (by our matching condition) the solution
must be zero everywhere.

The upshot is that we obtain only finitely many square-integrable
solutions to (5.2), up to multiplying each solution by a constant. Clearly,
then, the “true” eigenvectors for $\hat{H}$ [i.e., the ones that actually belong to
the Hilbert space $L^2(\mathbb{R})$] cannot form an orthonormal basis for $L^2(\mathbb{R})$.
Nevertheless, the spectral theorem (Chap. 7) provides something like an
orthonormal-basis decomposition of elements of $L^2(\mathbb{R})$ in terms of the
solutions to (5.2). A general element $\psi$ of $L^2(\mathbb{R})$ will be a sum of two terms.
The first term is a linear combination of the true ($L^2$) eigenvectors for
$\hat{H}$, which have $E < 0$. The second term is a continuous superposition
(i.e., an integral) of the non-square-integrable “generalized eigenvectors”
with $E \ge 0$.

In Chap. 9, we will introduce the notion of the spectrum of a (possibly
unbounded) self-adjoint operator $A$. We will see that a number $\lambda$ belongs
to the spectrum of $A$ if for all $\varepsilon > 0$ there exists a unit vector $\psi$ in the
domain of $A$ for which $\|A\psi - \lambda\psi\| < \varepsilon$. In the case of the Hamiltonian
operator $\hat{H}$ with a square well potential, it is not hard to show that every
real number $E$ with $E \ge 0$ belongs to the spectrum of $\hat{H}$ (Exercise 4).

It can be shown that if a number $E < 0$ is not an eigenvalue (i.e., if there
are no nonzero $L^2$ solutions to $\hat{H}\psi = E\psi$), then $E$ is not an element of the
spectrum of $\hat{H}$. This result is hinted at by Exercise 5. Thus, the spectrum
of $\hat{H}$ consists of a finite number of points in $(-C, 0)$ (at least one), together
with the whole half line $[0, \infty)$.


5.6 Exercises 


1. (a) Suppose $\psi$ is a smooth function on each of the intervals
$(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$ and that both $\psi$ and $\psi'$ are
continuous at $x = A$ and at $x = -A$. Show that for any smooth
function $\chi$ with compact support, we have

$$\int_{-\infty}^{\infty} \chi''(x)\psi(x)\,dx = \int_{-\infty}^{\infty} \chi(x)\psi''(x)\,dx, \tag{5.11}$$

where we leave $\psi''(x)$ undefined at $x = \pm A$ if the second derivative
does not exist at those points. (In light of Definition A.28,
(5.11) means that the second derivative of $\psi$, in the distribution
sense, is simply the function $\psi''$.)

Hint: Choose some interval $[-R, R]$ with $R > A$ containing the
support of $\chi$. Now use integration by parts separately on each
of the intervals $[-R, -A]$, $[-A, A]$, and $[A, R]$, paying careful
attention to the boundary terms.

(b) Suppose now that $\psi$ is a smooth function on each of the intervals
$(-\infty, -A)$, $(-A, A)$, and $(A, \infty)$, and that both $\psi$ and $\psi'$
have left and right limits at $x = \pm A$, but that, say, $\psi'$ has a
discontinuity at $x = -A$. Show that (5.11) has to be modified
by adding a nonzero multiple of $\chi(-A)$ to the right-hand side.





2. Verify the matching condition (5.10) for odd solutions of the
time-independent Schrödinger equation.


3. Let $\omega$ be a nonzero real number and consider a function of the form

$$\psi(x) = a\cos(\omega x) + b\sin(\omega x),$$

for real numbers $a$ and $b$. If $a$ and $b$ are not both zero, show that for
any $A \in \mathbb{R}$, we have

$$\lim_{B\to+\infty}\int_A^B \psi(x)^2\,dx = +\infty.$$
4. Let $f$ be a $C^\infty$ function on the interval $(0, 1)$ with the property that
$f(x) = 1$ for $0 < x < 1/3$ and $f(x) = 0$ for $2/3 < x < 1$. Then define
a family of “cutoff” functions $\chi_n$ on $\mathbb{R}$ by the formula

$$\chi_n(x) = \begin{cases} 1, & |x| \le n \\ f(|x| - n), & n < |x| < n + 1 \\ 0, & |x| \ge n + 1 \end{cases}$$

Given any $E \ge 0$, let $\psi$ be a nonzero solution to (5.2) for which $\psi(x)$
and $\psi'(x)$ are continuous at $x = \pm A$. Let $\psi_n = \psi\chi_n$. Show that $\psi_n$
belongs to the domain of $\hat{H}$ and that

$$\lim_{n\to\infty}\frac{\|\hat{H}\psi_n - E\psi_n\|}{\|\psi_n\|} = 0.$$

Note: As we will see in Chap. 9, this implies that every real number
$E$ with $E \ge 0$ belongs to the spectrum of the operator $\hat{H}$.

Hint: In estimating $\|\psi_n\|$, it may be helpful to apply Exercise 3 to
the real and imaginary parts of $\psi$ outside the well.


5. Suppose $E < 0$ and suppose that there exist no nonzero
square-integrable solutions to (5.2) for which $\psi$ and $\psi'$ are continuous. Let $\psi$
be a nonzero solution of (5.2) for which $\psi(x)$ and $\psi'(x)$ are continuous
at $x = \pm A$ and let $\psi_n$ be as in Exercise 4. Show that

$$\frac{\|\hat{H}\psi_n - E\psi_n\|}{\|\psi_n\|}$$

does not tend to zero as $n$ tends to infinity.








6. (a) Show that for $E < -C$, there are no nonzero square-integrable
solutions to (5.2) for which $\psi$ and $\psi'$ are continuous.

(b) Obtain the result of Part (a) when $E = -C$.
Hint: Analyze the even and odd cases separately.
7. Let the ground state for a particle in a square well denote the
eigenvector with the lowest (most negative) eigenvalue, which corresponds
to the largest value for $\varepsilon$.

(a) Show that the ground state is always an even function. That is
to say, show that the largest value of $\varepsilon$ satisfying (5.9) is always
larger than any solution to (5.10).

(b) Show that the ground state is a nowhere-zero function.


6 


Perspectives on the Spectral Theorem 





6.1 The Difficulties with the Infinite-Dimensional Case


Suppose $A$ is a self-adjoint $n \times n$ matrix, meaning that $A_{kj} = \overline{A_{jk}}$ for all
$1 \le j, k \le n$. Then a standard result in linear algebra asserts that there
exist an orthonormal basis $\{v_j\}_{j=1}^{n}$ for $\mathbb{C}^n$ and real numbers $\lambda_1, \ldots, \lambda_n$
such that $Av_j = \lambda_j v_j$. (See Theorem 18 in Chap. 8 of [24] and Exercise 4
in Chap. 7.)


We may state the same result in basis-independent language as follows.
Suppose $\mathbf{H}$ is a finite-dimensional Hilbert space and $A$ is a self-adjoint
linear operator on $\mathbf{H}$, meaning that $\langle\phi, A\psi\rangle = \langle A\phi, \psi\rangle$ for all $\phi, \psi \in \mathbf{H}$.
Then there exists an orthonormal basis of $\mathbf{H}$ consisting of eigenvectors for $A$
with real eigenvalues.
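In the smallest nontrivial case, $n = 2$, the finite-dimensional result can
be verified by hand, since the eigenvalues of a $2 \times 2$ self-adjoint
matrix have a closed form. The following sketch checks the conclusion for
one sample matrix; the function names, the sample entries, and the
convention that the inner product is conjugate-linear in its first argument
are assumptions made for the illustration.

```python
import math


def hermitian_2x2_eigen(a, b, d):
    """Eigenvalues and unit eigenvectors of [[a, b], [conj(b), d]]
    with a, d real and b complex, b != 0 (closed-form 2x2 case)."""
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    pairs = []
    for lam in (mean - radius, mean + radius):
        v = (b, lam - a)                      # solves (A - lam) v = 0 when b != 0
        norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs


def apply(a, b, d, v):
    """Apply the matrix [[a, b], [conj(b), d]] to the vector v."""
    return (a * v[0] + b * v[1], b.conjugate() * v[0] + d * v[1])


def inner(phi, psi):
    """Inner product on C^2, conjugate-linear in the first argument."""
    return phi[0].conjugate() * psi[0] + phi[1].conjugate() * psi[1]


a, b, d = 2.0, 1 - 1j, -1.0
(lam1, v1), (lam2, v2) = hermitian_2x2_eigen(a, b, d)

# Residuals A v - lam v for each eigenpair, and the overlap of the two vectors.
res1 = [x - lam1 * y for x, y in zip(apply(a, b, d, v1), v1)]
res2 = [x - lam2 * y for x, y in zip(apply(a, b, d, v2), v2)]
print(lam1, lam2)
print(max(abs(r) for r in res1 + res2), abs(inner(v1, v2)))
```

Both eigenvalues are real, the residuals vanish up to rounding, and the two
unit eigenvectors are orthogonal, so they form an orthonormal basis of
$\mathbb{C}^2$.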

Since there is a standard notion of orthonormal bases for general Hilbert
spaces, we might hope that a similar result would hold for self-adjoint
operators on infinite-dimensional Hilbert spaces. Simple examples, however,
show that a self-adjoint operator may not have any eigenvectors. Consider,
for example, $\mathbf{H} = L^2([0, 1])$ and an operator $A$ on $\mathbf{H}$ defined by

$$(A\psi)(x) = x\psi(x). \tag{6.1}$$

Then $A$ satisfies $\langle\phi, A\psi\rangle = \langle A\phi, \psi\rangle$ for all $\phi, \psi \in L^2([0, 1])$, and yet $A$
has no eigenvectors. After all, if $x\psi(x) = \lambda\psi(x)$, then $\psi$ would have to be
supported on the set where $x = \lambda$, which is a set of measure zero. Thus,
only the zero element of $L^2([0, 1])$ satisfies $A\psi = \lambda\psi$.

Now, a physicist would say that the operator $A$ in (6.1) does have
eigenvectors, namely the distributions $\delta(x - \lambda)$. (See Appendix A.3.3.)
These distributions indeed satisfy $x\delta(x - \lambda) = \lambda\delta(x - \lambda)$, but they do
not belong to the Hilbert space $L^2([0, 1])$. Such “eigenvectors,” which
belong to some larger space than $\mathbf{H}$, are known as generalized eigenvectors.
Even though these generalized eigenvectors are not actually in the Hilbert
space, we may hope that there is some sense in which they form something
like an orthonormal basis. See Sect. 6.6 for an example of how such a “basis”
might function.
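One way to see what such generalized eigenvectors approximate is to look at
unit vectors concentrated near the point $x = \lambda$: they are genuine
elements of $L^2([0, 1])$ that come arbitrarily close to satisfying the
eigenvalue equation for the operator in (6.1). The sketch below uses
normalized indicator functions of shrinking intervals; this choice of
vectors, the function name `defect_norm`, and the midpoint-rule quadrature
are illustrative assumptions. An exact computation gives
$\mathrm{width}/\sqrt{12}$ for the norm being approximated.

```python
import math


def defect_norm(lam, width, samples=20000):
    """|| (A - lam) psi || for psi = width**-0.5 times the indicator function of
    [lam - width/2, lam + width/2], where (A psi)(x) = x psi(x).
    Computed with the midpoint rule; the interval is assumed to lie in [0, 1]."""
    left = lam - width / 2
    h = width / samples
    total = 0.0
    for j in range(samples):
        x = left + (j + 0.5) * h
        total += (x - lam) ** 2 * (1.0 / width) * h   # |(x - lam) psi(x)|^2 dx
    return math.sqrt(total)


lam = 0.3
for width in (0.2, 0.02, 0.002):
    print(width, defect_norm(lam, width), width / math.sqrt(12))
```

The norm shrinks in proportion to the width, even though no nonzero vector
makes it exactly zero.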

Let us mention in passing that our simple expectation of a true orthonormal
basis of eigenvectors is realized for compact self-adjoint operators,
where an operator $A$ on $\mathbf{H}$ is said to be compact if the image under $A$ of
every bounded set in $\mathbf{H}$ has compact closure; see Theorem VI.16 in
Volume I of [34]. The operators of interest in quantum mechanics, however,
are not compact. (Of course, even if a self-adjoint operator is not compact,
it might still have an orthonormal basis of eigenvectors, as, e.g., in the case
of the Hamiltonian operator for a harmonic oscillator. See Chap. 11.)

Meanwhile, there is another serious difficulty that arises with self-adjoint
operators in the infinite-dimensional case. Most of the self-adjoint operators
A of quantum mechanics are unbounded operators, meaning that there is
no constant C such that $\|A\psi\| \le C\|\psi\|$ for all $\psi$. Suppose, for example,
that A is the position operator X on $L^2(\mathbb{R})$, given by $(X\psi)(x) = x\psi(x)$. If
$1_E$ denotes the indicator function of E (the function that is 1 on E and 0
elsewhere), then it is apparent that

\[ \|X 1_{[n,n+1]}\| \ge n \, \|1_{[n,n+1]}\| \]

for every positive integer n, and, thus, X cannot be bounded. Now, using
the closed graph theorem and elementary results from Sect. 9.3, it can be
shown that if A is defined on all of H and satisfies $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for
all $\phi, \psi \in H$, then A must be bounded. (See Corollary 9.9.) Thus, if A is
unbounded and self-adjoint, it cannot be defined on all of H.
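The growth of the ratio above can be checked by hand, since both norms are elementary integrals. The following is a minimal Python sketch of that check, not part of the text; the function names are illustrative assumptions.

```python
import math

def norm_sq_x_indicator(n: int) -> float:
    """||X 1_[n,n+1]||^2 = integral of x^2 over [n, n+1], in closed form."""
    return ((n + 1) ** 3 - n ** 3) / 3.0

def norm_sq_indicator(n: int) -> float:
    """||1_[n,n+1]||^2 = length of the interval [n, n+1]."""
    return 1.0

def growth_ratio(n: int) -> float:
    """The ratio ||X 1_[n,n+1]|| / ||1_[n,n+1]||, which is at least n."""
    return math.sqrt(norm_sq_x_indicator(n) / norm_sq_indicator(n))

# The ratios grow without bound, so no single constant C can work.
ratios = [growth_ratio(n) for n in range(1, 6)]
```

Since the ratio exceeds n for every positive integer n, no bound of the form $\|X\psi\| \le C\|\psi\|$ can hold.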

We define, then, an “unbounded operator on H” to be a linear operator A
from a dense subspace of H (known as the domain of A) to H. The notion
of self-adjointness for such operators is more complicated than in the
bounded case. The obvious condition, that $\langle \phi, A\psi \rangle$ should equal $\langle A\phi, \psi \rangle$ for
all $\phi$ and $\psi$ in the domain of A, is not the “right” condition. Specifically,
that condition is not sufficient to guarantee that the spectral theorem applies
to A. Rather, for any unbounded operator A, we will define the adjoint
A* of A, which will be an unbounded operator with its own domain. An
unbounded operator is then defined to be self-adjoint if the domains of A
and A* are the same and A and A* agree on their common domain. That
is to say, self-adjointness means not only that A and A* agree whenever
they are both defined, but also that the domains of A and A* agree.


6.2 The Goals of Spectral Theory


Before getting into the details of the spectral theory, let us think for a
moment about what it is we want the spectral theorem to do for us. In the
first place, we would like the spectral theorem to allow us to apply various
functions to an operator. We saw, for example, that the time-dependent
Schrödinger equation can be “solved” by setting $\psi(t) = \exp\{-itH/\hbar\}\psi_0$.
Because the Hamiltonian operator H is unbounded, it is not convenient
to use power series to define the exponential. If, however, H has a true
orthonormal basis $\{e_k\}$ of eigenvectors with corresponding eigenvalues $\lambda_k$,
then we can define $\exp\{-itH/\hbar\}$ to be the unique bounded operator with
the property that

\[ e^{-itH/\hbar} e_k = e^{-it\lambda_k/\hbar} e_k \]

for all k.

In cases where H does not have a true orthonormal basis of eigenvectors, 
we would like the spectral theorem to provide a “functional calculus” for 
H, that is, a system for applying functions (including exponentials) to H. 
This functional calculus should have properties similar to what we have in 
the case of a true orthonormal basis of eigenvectors. 
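In finite dimensions the recipe "apply f to each eigenvalue" can be carried out explicitly. The sketch below is an illustration only: the 2x2 matrix, the choice of units with $\hbar = 1$ by default, and the names `apply_function` and `evolve` are assumptions and do not come from the text.

```python
import cmath
import math

# Hypothetical 2x2 Hamiltonian H = [[0, 1], [1, 0]].
# Its orthonormal eigenvectors and eigenvalues are known in closed form.
S = 1.0 / math.sqrt(2.0)
EIGENPAIRS = [
    (1.0, (S, S)),     # eigenvalue +1, eigenvector (1, 1)/sqrt(2)
    (-1.0, (S, -S)),   # eigenvalue -1, eigenvector (1, -1)/sqrt(2)
]

def apply_function(f, psi):
    """Apply f(H) to psi: expand psi in the eigenbasis and scale each
    coefficient by f(eigenvalue)."""
    out = [0j, 0j]
    for lam, e in EIGENPAIRS:
        # Inner product <e, psi>, conjugate-linear in the first slot.
        coeff = sum(complex(ej).conjugate() * pj for ej, pj in zip(e, psi))
        for i in range(2):
            out[i] += f(lam) * coeff * e[i]
    return out

def evolve(t, psi0, hbar=1.0):
    """psi(t) = exp(-i t H / hbar) psi0, defined through the eigenbasis."""
    return apply_function(lambda lam: cmath.exp(-1j * t * lam / hbar), psi0)
```

For this particular matrix, `evolve(t, [1.0, 0.0])` should agree with the closed form $(\cos t, -i \sin t)$, and the norm of the vector is preserved, as expected of a unitary operator.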

In the second place, we would like the spectral theorem to provide a
probability distribution for the result of measuring a self-adjoint operator
A. Let us recall how measurement probabilities work in the case that
A has a true orthonormal basis $\{e_j\}$ of eigenvectors with eigenvalues $\lambda_j$.
Building on Example 3.12, we may compute the probabilities in such a case
as follows. Given any Borel set E of $\mathbb{R}$, let $V_E$ be the closed span of all the
eigenvectors for A with eigenvalues in E, and let $P_E$ be the orthogonal
projection onto $V_E$. Then for any unit vector $\psi$, we have

\[ \mathrm{prob}_\psi(A \in E) = \langle \psi, P_E \psi \rangle. \tag{6.2} \]

In particular, if the eigenvalues are distinct and $\psi$ decomposes as
$\psi = \sum_j c_j e_j$, the probability of observing the value $\lambda_j$ will be $|c_j|^2$ (as in
Example 3.12), since $P_{\{\lambda_j\}}$ is just the projection onto $e_j$.
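When the eigenvalues are distinct, formula (6.2) reduces to a finite sum. Here is a small Python sketch under that assumption; the eigenvalues, the coefficients, and the name `prob_in_set` are made up for illustration.

```python
# Hypothetical example: eigenvalues of an observable A and the coefficients
# c_j of a unit vector psi in the corresponding orthonormal eigenbasis.
EIGENVALUES = [-1.0, 0.5, 2.0]
COEFFS = [0.6, 0.8j, 0.0]   # |0.6|^2 + |0.8i|^2 + 0 = 1

def prob_in_set(eigenvalues, coeffs, in_set):
    """<psi, P_E psi>: sum of |c_j|^2 over the eigenvalues lying in E.
    The Borel set E is represented by its membership predicate in_set."""
    return sum(abs(c) ** 2 for lam, c in zip(eigenvalues, coeffs) if in_set(lam))

p_negative = prob_in_set(EIGENVALUES, COEFFS, lambda lam: lam < 0)
p_all = prob_in_set(EIGENVALUES, COEFFS, lambda lam: True)
```

Taking E to be all of the real line recovers total probability 1, which is just the statement that $\psi$ is a unit vector.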

In cases where A does not have a true orthonormal basis of eigenvectors,
we would like the spectral theorem to provide a family of projection
operators $P_E$, one for each Borel subset $E \subset \mathbb{R}$, which will allow us to define
probabilities as in (6.2). We will call these projection operators spectral
projections and the associated subspaces $V_E$ spectral subspaces. (Thus, $P_E$
is the orthogonal projection onto $V_E$.) Intuitively, $V_E$ may be thought of as
the closed span of all the generalized eigenvectors with eigenvalues in E.

In the first version of the spectral theorem, both these goals will be
achieved, with the spectral projections being provided by a projection-valued
measure and the functional calculus being provided by integration
with respect to this measure. Although having (generalized) eigenvectors
for a self-adjoint operator is, from a practical standpoint, of secondary
importance, we provide a framework for understanding such eigenvectors,
using the concept of a direct integral. The second version of the spectral
theorem decomposes the Hilbert space H as a direct integral, with respect
to a certain measure $\mu$, of generalized eigenspaces for a self-adjoint operator
A. The generalized eigenspace for a particular eigenvalue $\lambda$ will not
actually be a subspace of H, unless $\mu(\{\lambda\}) > 0$. Thus, the notion of a direct
integral gives a rigorous meaning to the notion of “eigenvectors” that are
not actually in the Hilbert space.


6.3 A Guide to Reading 


Although the portion of this book devoted to spectral theory is unavoidably 
technical in places, it has been designed so that the reader can take in as 
much or as little as desired. The reader who is willing to take things on faith 
can simply take in the examples of the position and momentum operators 
in Sects. 6.4 and 6.6 and accept these as prototypes of how the spectral 
theorem works. The reader who wants more details can find the statement 
of the spectral theorem for bounded operators, in two different forms, in 
Chap. 7, and can find the basics of unbounded self-adjoint operators in
Chap. 9. Finally, the reader who wants a complete treatment of the subject 
can find full proofs of the spectral theorem in both forms, first for bounded 
operators in Chap. 8, and then for unbounded operators in Chap. 10. 


6.4 The Position Operator 


As our first example, let us consider the position operator X, given by
$(X\psi)(x) = x\psi(x)$, acting on the Hilbert space $H = L^2(\mathbb{R})$. As for the
similar operator in Sect. 6.1, X has no true eigenvectors, that is, no
eigenvectors that are actually in H. If we think that the generalized eigenvectors
for X are the distributions $\delta(x - \lambda)$, $\lambda \in \mathbb{R}$, then we may make an educated
guess that the spectral subspace $V_E$ should consist of those functions that
are “supported” on E, that is, those that are zero almost everywhere on the
complement of E. (A superposition of the “functions” $\delta(x - \lambda)$, with $\lambda \in E$,
should be a function supported on E.)

The spectral projection $P_E$ is then the orthogonal projection onto $V_E$,
which may be computed as

\[ P_E \psi = 1_E \psi, \]

where $1_E$ is the indicator function of E. In that case, we have, following
(6.2),

\[ \mathrm{prob}_\psi(X \in E) = \langle \psi, P_E \psi \rangle = \int_E |\psi(x)|^2 \, dx. \]

This formula is just what we would have expected from our discussion in
Chap. 3, where we claimed that the probability distribution for the position
of the particle is $|\psi(x)|^2$.
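As a numerical illustration of the position probabilities, the sketch below integrates $|\psi(x)|^2$ over an interval for one specific unit vector. The Gaussian state, the midpoint rule, and the function names are assumptions chosen for the example.

```python
import math

def gaussian_density(x: float) -> float:
    """|psi(x)|^2 for the hypothetical unit vector psi(x) = pi^(-1/4) exp(-x^2/2)."""
    return math.exp(-x * x) / math.sqrt(math.pi)

def prob_position_in_interval(a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of |psi|^2 over E = [a, b]."""
    h = (b - a) / steps
    return h * sum(gaussian_density(a + (i + 0.5) * h) for i in range(steps))

p_unit_interval = prob_position_in_interval(0.0, 1.0)
exact = 0.5 * math.erf(1.0)  # closed form, valid for this particular Gaussian
```

For this state the probability of finding the particle in [0, 1] has the closed form $\tfrac{1}{2}\operatorname{erf}(1)$, which gives an independent check on the quadrature.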

Meanwhile, let us consider the functional calculus for X. If $f(\lambda) = \lambda^m$,
then f(X) should be just the mth power of X, which is multiplication by
$x^m$. It seems reasonable, then, to think that for any function f, we should
define f(X) to be simply multiplication by f(x). In particular, the operator
$e^{iaX}$ should be simply multiplication by $e^{iax}$, which is a bounded operator
on $L^2(\mathbb{R})$.


6.5 Multiplication Operators 


Since the position operator acts simply as multiplication by the function
x, it is straightforward to find the spectral subspaces and also to construct
the functional calculus for X. We may consider multiplication operators in
a more general setting. If $H = L^2(X, \mu)$ and h is a real-valued measurable
function on X, then we may define the multiplication operator $M_h$ on
$L^2(X, \mu)$ by

\[ M_h \psi = h\psi. \]

We can then construct spectral subspaces as

\[ V_E = \{\psi \mid \psi \text{ is supported on } h^{-1}(E)\} \]

and define a functional calculus by

\[ f(M_h) = \text{multiplication by } f \circ h. \]

One form of the spectral theorem may now be stated simply as follows: A
self-adjoint operator A on a separable Hilbert space is unitarily equivalent
to a multiplication operator. That is to say, there is some $\sigma$-finite measure
space $(X, \mu)$ and some measurable function h on X such that A is
unitarily equivalent to multiplication by h. (See Theorem 7.20.) Although
this version of the spectral theorem is compellingly easy to state, there is
a slight modification of it, involving direct integrals, that is in some ways
even better. See Sect. 7.3 for more information.
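The constructions of this section become completely concrete on a finite measure space. The following sketch assumes a four-point set with counting measure, so that $L^2(X, \mu)$ is just $\mathbb{C}^4$; the function h and all names are hypothetical.

```python
# Hypothetical finite model: X = {0, 1, 2, 3} with counting measure, so that
# L^2(X, mu) is C^4 and a "function" on X is a list of four numbers.
H_VALUES = [2.0, -1.0, 2.0, 0.5]   # the real-valued function h on X

def multiply_by(values, psi):
    """Pointwise multiplication: (M psi)(x) = values[x] * psi[x]."""
    return [v * p for v, p in zip(values, psi)]

def spectral_projection(in_set, psi):
    """P_E psi: keep psi on h^{-1}(E) and set it to zero elsewhere.
    The set E is represented by its membership predicate in_set."""
    return [p if in_set(h) else 0.0 for h, p in zip(H_VALUES, psi)]

def functional_calculus(f, psi):
    """f(M_h) psi = (f o h) psi."""
    return multiply_by([f(h) for h in H_VALUES], psi)

psi = [1.0, 2.0, 3.0, 4.0]
squared = functional_calculus(lambda lam: lam ** 2, psi)
projected = spectral_projection(lambda lam: lam > 1.0, psi)
```

In this model, applying the functional calculus with $f(\lambda) = \lambda^2$ gives the same vector as applying $M_h$ twice, and the spectral projection for $E = (1, \infty)$ keeps exactly the coordinates where h takes the value 2.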


6.6 The Momentum Operator 


Let us now see how the spectral theorem works out in the case of the
momentum operator, $P = -i\hbar \, d/dx$ on $L^2(\mathbb{R})$. The “eigenvectors” for
P are the functions $e^{ikx}$, $k \in \mathbb{R}$, with the corresponding eigenvalues being
$\hbar k$. Although the functions $e^{ikx}$ are not in $L^2(\mathbb{R})$, the Fourier transform
shows that any function in $L^2(\mathbb{R})$ can be expanded as a superposition
(i.e., a continuous version of a linear combination) of these functions. (See
Appendix A.3.2.) Indeed, the Fourier transform is very much like the
decomposition of a vector in an orthonormal basis, in that the Fourier
coefficients $\hat\psi(k)$ can be expressed in terms of the “inner product” of a function
$\psi$ with $e^{ikx}$:

\[ \hat\psi(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-ikx} \psi(x) \, dx = (2\pi)^{-1/2} \langle e^{ikx}, \psi \rangle_{L^2(\mathbb{R})}, \]

if we ignore the fact that $e^{ikx}$ is not actually in $L^2$.
Indeed, physicists frequently understand the Fourier transform by asserting
that the functions $e^{ikx}/\sqrt{2\pi}$ form an “orthonormal basis in the continuous
sense” for $L^2(\mathbb{R})$. Orthonormality in the continuous sense is supposed
to mean that one replaces the usual Kronecker delta in the definition of an
orthonormal set by the Dirac $\delta$-function:

\[ \left\langle \frac{e^{ikx}}{\sqrt{2\pi}}, \frac{e^{ilx}}{\sqrt{2\pi}} \right\rangle = \delta(k - l), \tag{6.3} \]

where $\delta$ is supposed to satisfy

\[ \int_{-\infty}^{\infty} f(k) \, \delta(k - l) \, dk = f(l) \]

for all continuous functions f. (Rigorously, $\delta(k - l)$ is a distribution; see
Appendix A.3.3.)

To give some rigorous meaning to (6.3), note that although the inner
product of $e^{ikx}$ and $e^{ilx}$ is not defined, we may approximate this inner
product by the expression

\[ \frac{1}{2\pi} \int_{-A}^{A} e^{-ikx} e^{ilx} \, dx = \frac{1}{2\pi} \left. \frac{e^{-i(k-l)x}}{-i(k-l)} \right|_{-A}^{A} = \frac{1}{\pi} \frac{\sin[A(k - l)]}{k - l}. \]

It is possible to show that the above function, viewed as a function of k for
fixed A and l, behaves like $\delta(k - l)$ in the limit as A tends to infinity. That
is to say, for all sufficiently nice functions $\psi$, we have

\[ \lim_{A \to +\infty} \int_{-\infty}^{\infty} \psi(k) \, \frac{1}{\pi} \frac{\sin[A(k - l)]}{k - l} \, dk = \psi(l). \tag{6.4} \]

Here is a heuristic argument for (6.4). By making the change of variable
$k' = k - l$, we may reduce the general problem to the case $l = 0$. If we then
make the change of variable $\kappa = Ak$, the desired result is equivalent to

\[ \lim_{A \to +\infty} \int_{-\infty}^{\infty} \psi(\kappa/A) \, \frac{1}{\pi} \frac{\sin \kappa}{\kappa} \, d\kappa = \psi(0). \tag{6.5} \]

Now, if we can bring the limit inside the integral, $\psi(\kappa/A)$ will tend to $\psi(0)$
as A tends to infinity. Since the rest of the integrand on the left-hand
side of (6.5) is already independent of A, the result would then follow if we
could show that

\[ \int_{-\infty}^{\infty} \frac{1}{\pi} \frac{\sin \kappa}{\kappa} \, d\kappa = 1. \tag{6.6} \]

Even though the integral in (6.6) is not absolutely convergent, it is a con- 
vergent improper integral. The value of the integral can be obtained by the 
method of contour integration (or the method of consulting a table of in- 
tegrals), and indeed (6.6) holds. Since (6.3) is, in any case, only a heuristic 
way of thinking about the Fourier transform, we will not take the time to 
develop a rigorous version of the preceding argument. 
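Although the argument is only heuristic, the convergence in (6.4) is easy to observe numerically. The following Python sketch does this for one test function; the Gaussian, the truncation window, the midpoint rule, and the function names are all assumptions made for the illustration.

```python
import math

def sinc_kernel(A: float, u: float) -> float:
    """The kernel sin(A u) / (pi u), extended by continuity to u = 0."""
    if abs(u) < 1e-12:
        return A / math.pi
    return math.sin(A * u) / (math.pi * u)

def smoothed_value(psi, l: float, A: float,
                   cutoff: float = 8.0, steps: int = 16000) -> float:
    """Midpoint-rule approximation of the integral in (6.4), truncated to
    |k| <= cutoff. The truncation assumes psi is negligible outside
    that window."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        k = -cutoff + (i + 0.5) * h
        total += psi(k) * sinc_kernel(A, k - l)
    return total * h

def gaussian(k: float) -> float:
    """A hypothetical 'sufficiently nice' test function."""
    return math.exp(-k * k)

# The approximation should approach gaussian(0.5) as A grows.
approximations = {A: smoothed_value(gaussian, 0.5, A) for A in (1.0, 5.0, 20.0)}
target = gaussian(0.5)
```

For a Gaussian the convergence is very fast, because the integral in (6.4) amounts to discarding the frequencies of $\psi$ beyond A, and a Gaussian has very little high-frequency content.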

It is possible to derive, at least formally, many of the standard properties
of the Fourier transform by using (6.3), just as one can obtain properties
of Fourier series by using the orthonormality of the functions $e^{2\pi i n x}$ in
$L^2([0,1])$. More importantly, the Fourier transform is precisely the unitary
transformation that changes the momentum operator into a multiplication
operator. To see this property of the Fourier transform more clearly, we
introduce a simple rescaling of it.


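Definition 6.1 For any $\psi \in L^2(\mathbb{R})$, define $\tilde\psi$ by

\[ \tilde\psi(p) = \frac{1}{\sqrt{\hbar}} \, \hat\psi\!\left( \frac{p}{\hbar} \right), \]

so that

\[ \tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} e^{-ipx/\hbar} \psi(x) \, dx. \]

The function $\tilde\psi(p)$ is the momentum wave function associated with $\psi$.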


By the Plancherel theorem (Theorem A.19) and a change of variable, if $\psi$
is a unit vector, then so is $\hat\psi$ and also $\tilde\psi$. For any unit vector $\psi$, we interpret
$|\tilde\psi(p)|^2$ as the probability density for the momentum of the particle, just as
$|\psi(x)|^2$ is the probability distribution of the position of the particle. Using
Proposition A.17, we may readily verify that for nice enough $\psi$, we have

\[ \widetilde{P\psi}(p) = p \, \tilde\psi(p). \tag{6.7} \]

Equation (6.7) means that the unitary map $\psi \mapsto \tilde\psi$ turns the momentum
operator P into multiplication by p. That is to say, the spectral theorem,
in its “multiplication operator” form, is accomplished in this case by the
Fourier transform (scaled as in Definition 6.1).
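Equation (6.7) can be tested numerically for a specific state. The sketch below assumes a normalized Gaussian, for which $P\psi$ is available in closed form, and approximates the integral in Definition 6.1 by a truncated midpoint rule; the function names and numerical parameters are illustrative assumptions.

```python
import cmath
import math

def momentum_wave_function(phi, p: float, hbar: float = 1.0,
                           cutoff: float = 12.0, steps: int = 6000) -> complex:
    """Midpoint-rule approximation of
    (2 pi hbar)^(-1/2) * integral of exp(-i p x / hbar) phi(x) dx,
    truncated to |x| <= cutoff (assumes phi is negligible outside)."""
    h = 2.0 * cutoff / steps
    total = 0j
    for i in range(steps):
        x = -cutoff + (i + 0.5) * h
        total += cmath.exp(-1j * p * x / hbar) * phi(x)
    return total * h / math.sqrt(2.0 * math.pi * hbar)

def psi(x: float) -> float:
    """Hypothetical test state: a normalized Gaussian."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

def make_p_psi(hbar: float):
    """P psi = -i hbar psi'(x); for the Gaussian above, psi'(x) = -x psi(x)."""
    return lambda x: 1j * hbar * x * psi(x)

def check_6_7(p: float, hbar: float = 1.0):
    """Return both sides of (6.7) at the momentum value p."""
    lhs = momentum_wave_function(make_p_psi(hbar), p, hbar)
    rhs = p * momentum_wave_function(psi, p, hbar)
    return lhs, rhs
```

Calling `check_6_7(0.7)` returns two complex numbers that should agree up to quadrature error, which is the content of (6.7) at the single point p = 0.7.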

In terms of the momentum wave function, we may define spectral projections
and a functional calculus for P, just as in Sect. 6.5. For any Borel
set $E \subset \mathbb{R}$, we may define a projection $P_E$ to be the orthogonal projection
onto the space of functions $\psi$ for which $\tilde\psi(p)$ is zero almost everywhere
outside of E. If f is any bounded measurable function on $\mathbb{R}$, we can define
an operator f(P) by defining $f(P)\psi$ to be the unique element of $L^2(\mathbb{R})$ for
which

\[ \widetilde{f(P)\psi}(p) = f(p) \, \tilde\psi(p). \]


7 


The Spectral Theorem for Bounded
Self-Adjoint Operators: Statements


In the present chapter, we will consider the spectral theorem for bounded 
self-adjoint operators, leaving a discussion of unbounded operators to 
Chaps. 9 and 10. The proofs of the main theorems (two different versions 
of the spectral theorem) are moderately long and are deferred to Chap. 8. 
After some elementary definitions and results in Sect. 7.1, we come to the 
main results in Sects. 7.2 and 7.3. Throughout the chapter, H will, as usual, 
denote a separable Hilbert space over C. 


7.1 Elementary Properties of Bounded Operators 


As usual, we will let H denote a separable complex Hilbert space. Recall
from Appendix A.3.4 that a linear operator A on H is said to be bounded
if the operator norm of A,

\[ \|A\| := \sup_{\psi \in H \setminus \{0\}} \frac{\|A\psi\|}{\|\psi\|}, \tag{7.1} \]

is finite. The space of bounded operators on H forms a Banach space under
the operator norm, and we have the inequality

\[ \|AB\| \le \|A\| \, \|B\| \tag{7.2} \]

for all bounded operators A and B.


Definition 7.1 The Banach space of bounded operators on H, with respect
to the operator norm (7.1), is denoted B(H).

Recall (Appendix A.4.3) that for any $A \in B(H)$ there is a unique operator
$A^* \in B(H)$, called the adjoint of A, such that

\[ \langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle \]

for all $\phi, \psi \in H$. An operator $A \in B(H)$ is called self-adjoint if $A^* = A$.
We say that $A \in B(H)$ is non-negative if

\[ \langle \psi, A\psi \rangle \ge 0 \tag{7.3} \]

for all $\psi \in H$.

Proposition 7.2 For all $A \in B(H)$, we have

\[ \|A^*\| = \|A\| \]

and

\[ \|A^*A\| = \|A\|^2. \]

In particular, if A is self-adjoint, we have the useful result that $\|A^2\| = \|A\|^2$.
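Proof. The operator norm of A can also be computed as

\[ \|A\| = \sup_{\|\psi\| = 1} \|A\psi\|. \]

Furthermore, for any vector $\phi \in H$, $\|\phi\| = \sup_{\|\chi\| = 1} |\langle \chi, \phi \rangle|$. (The inequality in
one direction is by the Cauchy–Schwarz inequality, and the inequality in the
other direction is by taking $\chi$ to be a multiple of $\phi$.) Thus,

\[ \|A\| = \sup_{\|\phi\| = \|\psi\| = 1} |\langle \phi, A\psi \rangle|. \]

From this, we get

\[ \begin{aligned}
\|A^*\| &= \sup_{\|\phi\| = \|\psi\| = 1} |\langle \phi, A^*\psi \rangle| \\
&= \sup_{\|\phi\| = \|\psi\| = 1} |\langle A\phi, \psi \rangle| \\
&= \sup_{\|\phi\| = \|\psi\| = 1} |\langle \psi, A\phi \rangle| \\
&= \|A\|.
\end{aligned} \]

Meanwhile, $\|A^*A\| \le \|A^*\| \, \|A\| = \|A\|^2$. On the other hand,

\[ \begin{aligned}
\|A^*A\| &= \sup_{\|\phi\| = \|\psi\| = 1} |\langle \phi, A^*A\psi \rangle| \\
&= \sup_{\|\phi\| = \|\psi\| = 1} |\langle A\phi, A\psi \rangle| \\
&\ge \sup_{\|\psi\| = 1} |\langle A\psi, A\psi \rangle| \\
&= \|A\|^2,
\end{aligned} \]

which establishes the inequality in the other order. ∎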
We now record an elementary but very useful result. 


Proposition 7.3 For all $A \in B(H)$, we have

\[ [\mathrm{Range}(A)]^\perp = \ker(A^*), \]

where for any $B \in B(H)$, $\ker(B)$ denotes the kernel of B.


Proof. Suppose first that $\psi$ belongs to $[\mathrm{Range}(A)]^\perp$. Then for all $\phi \in H$,
we have

\[ 0 = \langle \psi, A\phi \rangle = \langle A^*\psi, \phi \rangle. \tag{7.4} \]

This implies that $A^*\psi = 0$ and thus that $\psi \in \ker(A^*)$. Conversely, suppose
$\psi \in \ker(A^*)$. Then for all $\phi \in H$, (7.4) holds (reading the equation from
right to left). This shows that $\psi$ is orthogonal to every element of the form
$A\phi$, meaning that $\psi \in [\mathrm{Range}(A)]^\perp$. ∎
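As a quick sanity check of Proposition 7.3, here is a worked example on $H = \mathbb{C}^2$. The matrix is a hypothetical choice made purely for illustration and does not come from the text.

```latex
\[
A = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
A^* = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}.
\]
% A sends (x, y) to (x, x), so Range(A) is spanned by the first column of A.
\[
\mathrm{Range}(A) = \mathrm{span}\left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\}, \qquad
[\mathrm{Range}(A)]^\perp = \mathrm{span}\left\{ \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}.
\]
% A^* sends (x, y) to (x + y, 0), so its kernel is the line y = -x.
\[
\ker(A^*) = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} : x + y = 0 \right\}
= \mathrm{span}\left\{ \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}
= [\mathrm{Range}(A)]^\perp .
\]
```

Note that $\ker(A)$ itself is spanned by $(0, 1)$, which is not orthogonal to the range, so the adjoint in the statement cannot be dropped.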
Next, we define the spectrum of a bounded operator, which plays the 
same role as the set of eigenvalues in the finite-dimensional case. 


Definition 7.4 For $A \in B(H)$, the resolvent set of A, denoted $\rho(A)$,
is the set of all $\lambda \in \mathbb{C}$ such that the operator $(A - \lambda I)$ has a bounded
inverse. The spectrum of A, denoted by $\sigma(A)$, is the complement in $\mathbb{C}$ of
the resolvent set. For $\lambda$ in the resolvent set of A, the operator $(A - \lambda I)^{-1}$
is called the resolvent of A at $\lambda$.


Saying that $(A - \lambda I)$ has a bounded inverse means that there exists a
bounded operator B such that

\[ (A - \lambda I)B = B(A - \lambda I) = I. \]

If A is bounded and $A - \lambda I$ is one-to-one and maps H onto H, then it
follows from the closed graph theorem (Theorem A.39) that the inverse
map must be bounded. Thus, the resolvent set of A can alternatively be
described as the set of $\lambda \in \mathbb{C}$ for which $A - \lambda I$ is one-to-one and onto.


Proposition 7.5 For all $A \in B(H)$, the following results hold.


1. The spectrum $\sigma(A)$ of A is a closed, bounded, and nonempty subset
of $\mathbb{C}$.


2. If $|\lambda| > \|A\|$, then $\lambda$ is in the resolvent set of A.


Lemma 7.6 Suppose $X \in B(H)$ satisfies $\|X\| < 1$. Then the operator
$I - X$ is invertible, with the inverse given by the following convergent series
in B(H):

\[ (I - X)^{-1} = I + X + X^2 + X^3 + \cdots. \tag{7.5} \]


Proof. As a consequence of (7.2), we have $\|X^m\| \le \|X\|^m$. The (geometric)
series on the right-hand side of (7.5) is therefore absolutely convergent and
thus convergent in the Banach space B(H) (Appendix A.3.4). If we multiply
this series on either side by $(I - X)$, everything will cancel except I, showing
that the sum of the series is the inverse of $(I - X)$. ∎
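The series (7.5) can be watched converging in a small matrix example. The sketch below is illustrative only: the 2x2 matrix X is a hypothetical choice whose entries are small enough that $\|X\| < 1$, and the helper names are assumptions.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    """Entrywise difference of two 2x2 matrices."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def neumann_partial_sum(x, terms):
    """I + X + X^2 + ... + X^(terms - 1)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = IDENTITY
    for _ in range(terms):
        total = mat_add(total, power)
        power = mat_mul(power, x)
    return total

# Hypothetical X; its Frobenius norm is sqrt(0.3) < 1, so ||X|| < 1 as well.
X = [[0.2, 0.1], [-0.3, 0.4]]
approx_inverse = neumann_partial_sum(X, 60)
# (I - X) times the partial sum differs from I only by a power of X.
residual = mat_sub(mat_mul(mat_sub(IDENTITY, X), approx_inverse), IDENTITY)
```

The residual equals minus the 60th power of X, whose entries are far below rounding error here, which mirrors the "everything will cancel except I" step of the proof.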

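Proof of Proposition 7.5. For any nonzero $\lambda \in \mathbb{C}$, consider the operator

\[ A - \lambda I = -\lambda \left( I - \frac{A}{\lambda} \right). \]

If $|\lambda| > \|A\|$, then $\|A/\lambda\| < 1$, and $I - A/\lambda$ is invertible by the lemma. It
then follows that $A - \lambda I$ is invertible, with

\[ (A - \lambda I)^{-1} = -\frac{1}{\lambda} \left( I + \frac{A}{\lambda} + \frac{A^2}{\lambda^2} + \cdots \right). \tag{7.6} \]

Thus, $\lambda$ is in the resolvent set of A. This establishes Point 2 in the proposition
and shows that $\sigma(A)$ is bounded.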

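Suppose now that $\lambda_0 \in \mathbb{C}$ is in the resolvent set of A. Then for another
number $\lambda \in \mathbb{C}$, we have

\[ \begin{aligned}
A - \lambda I &= A - \lambda_0 I - (\lambda - \lambda_0) I \\
&= (A - \lambda_0 I) \left( I - (\lambda - \lambda_0)(A - \lambda_0 I)^{-1} \right).
\end{aligned} \tag{7.7} \]

Thus, if

\[ |\lambda - \lambda_0| < \frac{1}{\|(A - \lambda_0 I)^{-1}\|}, \]

both factors on the right-hand side of (7.7) will be invertible, so that $A - \lambda I$
is also invertible. Thus, the resolvent set of A is open and the spectrum is
closed.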

To show that $\sigma(A)$ is nonempty, note that $(A - \lambda I)^{-1}$ may be computed as
follows:

\[ \begin{aligned}
(A - \lambda I)^{-1} &= \left( I - (\lambda - \lambda_0)(A - \lambda_0 I)^{-1} \right)^{-1} (A - \lambda_0 I)^{-1} \\
&= \left( \sum_{m=0}^{\infty} (\lambda - \lambda_0)^m \left( (A - \lambda_0 I)^{-1} \right)^m \right) (A - \lambda_0 I)^{-1}.
\end{aligned} \tag{7.8} \]

Thus, near any point $\lambda_0$ in the resolvent set of A, the resolvent $(A - \lambda I)^{-1}$
can be computed by the locally convergent series (7.8) in powers of $\lambda - \lambda_0$,
with the coefficients of the series being elements of B(H). For any $\phi, \psi \in H$,
the map

\[ \lambda \mapsto \langle \phi, (A - \lambda I)^{-1} \psi \rangle \tag{7.9} \]

will be given by a locally convergent power series with coefficients in $\mathbb{C}$,
meaning that the function (7.9) is a holomorphic function on the resolvent
set of A. Furthermore, from (7.6) we can see that $\|(A - \lambda I)^{-1}\|$ tends to
zero as $|\lambda|$ tends to infinity, and so also does the right-hand side of (7.9).

If $\sigma(A)$ were the empty set, the function (7.9) would be holomorphic
on all of $\mathbb{C}$ and tending to zero at infinity. By Liouville’s theorem, the
right-hand side of (7.9) would have to be identically zero for all $\phi$ and
$\psi$, which would mean that $(A - \lambda I)^{-1}$ is the zero operator. But since
$(A - \lambda I)(A - \lambda I)^{-1} = I$, the operator $(A - \lambda I)^{-1}$ cannot be zero.

If $A\psi = \lambda\psi$ for some $\lambda \in \mathbb{C}$ and some nonzero $\psi \in H$, then $(A - \lambda I)$ has
a nonzero kernel and so $\lambda$ is in the spectrum of A. Thus, any eigenvalue
for A is contained in the spectrum of A. In the infinite-dimensional case,
however, the converse is not true: A point in the spectrum may not be an
eigenvalue for A. Nevertheless, for a bounded self-adjoint operator A, the
spectrum of A may be described in a way that is not too far removed from
what we have in the finite-dimensional case.


Proposition 7.7 If $A \in B(H)$ is self-adjoint, then the following results
hold.


1. The spectrum of A is contained in the real line.


2. A number $\lambda \in \mathbb{R}$ belongs to the spectrum of A if and only if there
exists a sequence $\psi_n$ of nonzero vectors in H such that

\[ \lim_{n \to \infty} \frac{\|A\psi_n - \lambda\psi_n\|}{\|\psi_n\|} = 0. \tag{7.10} \]


Condition 2 in the proposition says that $\lambda \in \mathbb{R}$ belongs to the spectrum
if and only if $\lambda$ is “almost an eigenvalue,” meaning that there exists $\psi \ne 0$
for which $A\psi$ is equal to $\lambda\psi$ plus an error that is small compared to the
size of $\psi$.


Lemma 7.8 If $A \in B(H)$ is self-adjoint, then for all $\lambda = a + ib \in \mathbb{C}$, we
have

\[ \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle \ge b^2 \langle \psi, \psi \rangle. \tag{7.11} \]

Proof. We compute that

\[ \begin{aligned}
\langle (A - (a + ib)I)\psi, (A - (a + ib)I)\psi \rangle
&= \langle (A - aI)\psi, (A - aI)\psi \rangle + ib \langle \psi, (A - aI)\psi \rangle \\
&\quad - ib \langle (A - aI)\psi, \psi \rangle + b^2 \langle \psi, \psi \rangle.
\end{aligned} \tag{7.12} \]

Since A is self-adjoint, so is $A - aI$, from which we see that the second and
third terms on the right-hand side of (7.12) cancel, leaving us with

\[ \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle = \langle (A - aI)\psi, (A - aI)\psi \rangle + b^2 \langle \psi, \psi \rangle, \]

from which the desired inequality follows. ∎

Proof of Proposition 7.7. For Point 1, we need to show that any complex
number $\lambda = a + ib$ with $b \ne 0$ belongs to the resolvent set of A. Since
$b \ne 0$, (7.11) shows that $A - \lambda I$ is injective. Meanwhile, by Proposition 7.3,
$[\mathrm{Range}(A - \lambda I)]^\perp = \ker(A - \bar\lambda I)$. Since $\bar\lambda$ also has nonzero imaginary part,
$A - \bar\lambda I$ is injective, and so the range of $A - \lambda I$ is dense in H. To show
that the range is all of H, consider any $\phi \in H$ and choose a sequence
$\phi_n = (A - \lambda I)\psi_n$ in $\mathrm{Range}(A - \lambda I)$ with $\phi_n \to \phi$. Applying (7.11) with $\psi$
replaced by $\psi_n - \psi_m$ shows that $(\psi_n)$ is a Cauchy sequence. Thus, $\psi_n \to \psi$
for some $\psi \in H$. Since A is bounded,

\[ (A - \lambda I)\psi = \lim_{n \to \infty} (A - \lambda I)\psi_n = \lim_{n \to \infty} \phi_n = \phi. \]

We conclude, then, that $A - \lambda I$ is one-to-one and onto. The inverse operator
$(A - \lambda I)^{-1}$ is bounded, by (7.11) (or by the closed graph theorem).

For Point 2, assume there exists a sequence as in (7.10), and suppose that
$A - \lambda I$ had an inverse. Letting $\phi_n = (A - \lambda I)\psi_n$, we have $\psi_n = (A - \lambda I)^{-1}\phi_n$
and so (7.10) says that

\[ \lim_{n \to \infty} \frac{\|\phi_n\|}{\|(A - \lambda I)^{-1}\phi_n\|} = 0, \]

which shows that $(A - \lambda I)^{-1}$ is actually unbounded. Thus, $A - \lambda I$ cannot
have a bounded inverse.

Conversely, if, for some $\lambda \in \mathbb{R}$, no such sequence exists, then there exists
some $\varepsilon > 0$ such that

\[ \|(A - \lambda I)\psi\| \ge \varepsilon \|\psi\| \tag{7.13} \]

for all $\psi \in H$. Then $A - \lambda I$ is injective and Proposition 7.3 tells us that
the range of the self-adjoint operator $A - \lambda I$ is dense in H. Arguing as in
the preceding paragraphs with (7.13) in place of (7.11), we can see that the
range of $A - \lambda I$ is also closed, hence all of H. This shows that $A - \lambda I$ has
an inverse, which is bounded by (7.13).


Example 7.9 Let $H = L^2([0,1])$ and let A be the operator on H defined
by

\[ (A\psi)(x) = x\psi(x). \]

Then this operator is bounded and self-adjoint, and its spectrum is given by

\[ \sigma(A) = [0, 1]. \]

As we have already noted in Sect. 6.1, the operator A does not have any
(true) eigenvectors.

Proof. It is apparent that $\|A\psi\| \le \|\psi\|$ and that $\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all
$\phi, \psi \in H$, so that A is bounded and self-adjoint. Given $\lambda \in (0,1)$, consider
the functions $\psi_n := 1_{[\lambda, \lambda + 1/n]}$, which satisfy $\|\psi_n\|^2 = 1/n$. On the other
hand, since $|x - \lambda| \le 1/n$ on $[\lambda, \lambda + 1/n]$, we have

\[ \|(A - \lambda I)\psi_n\|^2 \le 1/n^3. \]

Thus, by Proposition 7.7, $\lambda$ belongs to the spectrum of A. Since this holds
for all $\lambda \in (0,1)$ and the spectrum of A is closed, $\sigma(A) \supset [0, 1]$.

Meanwhile, if $\lambda \notin [0,1]$, then the function $1/(x - \lambda)$ is bounded on
[0, 1], and so $A - \lambda I$ has a bounded inverse, consisting of multiplication by
$1/(x - \lambda)$. Thus, $\sigma(A) = [0,1]$. ∎
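The "almost eigenvector" estimate in this proof can be made fully explicit, since both norms are elementary integrals. The following Python sketch evaluates the ratio in (7.10) for the indicator functions used above; the function name is an illustrative assumption.

```python
import math

def approx_eigen_ratio(lam: float, n: int) -> float:
    """||(A - lam I) psi_n|| / ||psi_n|| for psi_n = 1_[lam, lam + 1/n],
    where A is multiplication by x on L^2([0, 1]).
    Both norms are available in closed form:
      ||psi_n||^2              = 1/n,
      ||(A - lam I) psi_n||^2  = integral of (x - lam)^2 over the interval
                               = 1/(3 n^3).
    """
    if not (0.0 <= lam and lam + 1.0 / n <= 1.0):
        raise ValueError("the interval [lam, lam + 1/n] must lie inside [0, 1]")
    norm_sq_psi = 1.0 / n
    norm_sq_error = 1.0 / (3.0 * n ** 3)
    return math.sqrt(norm_sq_error / norm_sq_psi)

# The ratio equals 1/(sqrt(3) n), which tends to zero as n grows.
ratios = [approx_eigen_ratio(0.5, n) for n in (2, 10, 100, 1000)]
```

The exact value $1/(\sqrt{3}\,n)$ is slightly better than the bound $1/n$ obtained in the proof, and either one suffices for Proposition 7.7.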


7.2 Spectral Theorem for Bounded Self-Adjoint 
Operators, I 


7.2.1 Spectral Subspaces 


Given a bounded (for now) self-adjoint operator A, we hope to associate
with each Borel set $E \subset \sigma(A)$ a closed subspace $V_E$ of H, where we think
intuitively that $V_E$ is the closed span of the generalized eigenvectors for A
with eigenvalues in E. [We could do this more generally for any $E \subset \mathbb{R}$,
but we do not expect any contribution from $\mathbb{R} \setminus \sigma(A)$.] We would expect the
collection of these subspaces to have the following properties.


1. $V_{\sigma(A)} = H$ and $V_\emptyset = \{0\}$.
2. If E and F are disjoint, then $V_E \perp V_F$.
3. For any E and F, $V_{E \cap F} = V_E \cap V_F$.


4. If $E_1, E_2, \ldots$ are disjoint and $E = \cup_j E_j$, then

\[ V_E = \bigoplus_j V_{E_j}. \]


5. For any E, $V_E$ is invariant under A.
6. If $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$ and $\psi \in V_E$, then

\[ \|(A - \lambda_0 I)\psi\| \le \varepsilon \|\psi\|. \]


The condition $V_{\sigma(A)} = H$ captures the idea that our generalized eigenvectors
should span H, while Property 2 captures the idea that our generalized
eigenvectors should have some sort of orthogonality for distinct eigenvalues,
even if they are not actually in the Hilbert space. In Property 4, there
may be infinitely many of the $E_j$’s, in which case, the direct sum is in the
Hilbert space sense (Definition A.45). Properties 5 and 6 capture the idea
that $V_E$ is made up of generalized eigenvectors for A with eigenvalues in E.


7.2.2 Projection-Valued Measures


It is convenient to describe closed subspaces of a Hilbert space H in terms of
the associated orthogonal projection operators. Recall (Proposition A.57)
that, given a closed subspace V of H, there exists a unique bounded operator
P that equals the identity on V and equals zero on the orthogonal
complement $V^\perp$ of V. This operator is called the orthogonal projection
onto V and satisfies $P^2 = P$ and $P^* = P$. The following definition expresses
the first four properties of our spectral subspaces (the ones that
do not involve the operator A) in terms of the corresponding orthogonal
projections. Since those properties are similar to those of a measure, we
use the term projection-valued measure.


Definition 7.10 Let X be a set and $\Omega$ a $\sigma$-algebra in X. A map $\mu : \Omega \to B(H)$
is called a projection-valued measure if the following properties
are satisfied.


1. For each $E \in \Omega$, $\mu(E)$ is an orthogonal projection.
2. $\mu(\emptyset) = 0$ and $\mu(X) = I$.
3. If $E_1, E_2, E_3, \ldots$ in $\Omega$ are disjoint, then for all $v \in H$, we have

\[ \mu\left( \bigcup_{j=1}^{\infty} E_j \right) v = \sum_{j=1}^{\infty} \mu(E_j) v, \]

where the convergence of the sum is in the norm topology on H.
4. For all $E_1, E_2 \in \Omega$, we have $\mu(E_1 \cap E_2) = \mu(E_1)\mu(E_2)$.


Note that if $E_1$ and $E_2$ are disjoint, then Properties 2 and 4 tell us
that $\mu(E_1)\mu(E_2) = 0$, from which it follows (Exercise 10) that the range
of $\mu(E_1)$ and the range of $\mu(E_2)$ are perpendicular. It is then not hard to
verify that $\mu(E_1)\mu(E_2)$ is the projection onto the intersection of the ranges
of $\mu(E_1)$ and $\mu(E_2)$ (Exercise 11). Thus, if we define, for each $E \in \Omega$, a
closed subspace $V_E := \mathrm{Range}(\mu(E))$, then the collection of $V_E$’s satisfies the
first four properties that we anticipated for spectral subspaces.
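The simplest projection-valued measures live on finite sets. The sketch below assumes a three-point set X with all subsets measurable and $H = \mathbb{C}^3$, with $\mu(E)$ the diagonal projection onto the coordinates indexed by E; this model and the function names are hypothetical and serve only to illustrate Definition 7.10.

```python
# Hypothetical finite model: X = {0, 1, 2}, Omega = all subsets of X, H = C^3.
# mu(E) is the diagonal projection that keeps the coordinates indexed by E.

def mu(E):
    """The 3x3 diagonal projection associated with the subset E of {0, 1, 2}."""
    return [[1.0 if (i == j and i in E) else 0.0 for j in range(3)]
            for i in range(3)]

def mat_mul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

E1, E2 = {0, 1}, {1, 2}
product = mat_mul3(mu(E1), mu(E2))       # Property 4: should equal mu(E1 & E2)
disjoint = mat_mul3(mu({0}), mu({2}))    # disjoint sets: product should be zero
```

In this model Property 3 of the definition is just finite additivity of diagonal matrices, and the spectral subspace $V_E$ is the coordinate subspace spanned by the basis vectors indexed by E.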

In the next subsection, we will associate a projection-valued measure $\mu^A$
with each bounded self-adjoint operator A. In that case, the projection
$\mu^A(E)$ will be thought of as a projection onto the spectral subspace
corresponding to E. We are about to introduce the notion of operator-valued
integration with respect to a projection-valued measure. In the case of the
projection-valued measure $\mu^A$ associated with A, this operator-valued
integral will be the functional calculus for A.

Observe that, for any projection-valued measure $\mu$ and $\psi \in H$, we can
form an ordinary (positive) real-valued measure $\mu_\psi$ by setting

\[ \mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle \tag{7.14} \]

for all $E \in \Omega$. This observation provides a link between integration with
respect to a projection-valued measure and integration with respect to an
ordinary measure.


Proposition 7.11 (Operator-Valued Integration) Let $\Omega$ be a $\sigma$-algebra
in a set X and let $\mu : \Omega \to B(H)$ be a projection-valued measure. Then
there exists a unique linear map, denoted $f \mapsto \int_X f \, d\mu$, from the space of
bounded, measurable, complex-valued functions on X into B(H) with the
property that

\[ \left\langle \psi, \left( \int_X f \, d\mu \right) \psi \right\rangle = \int_X f \, d\mu_\psi \tag{7.15} \]

for all f and all $\psi \in H$, where $\mu_\psi$ is given by (7.14). This integral has the
following additional properties.


1. For all $E \in \Omega$, we have

\[ \int_X 1_E \, d\mu = \mu(E). \]

In particular, the integral of the constant function 1 is I.


2. For all f, we have

\[ \left\| \int_X f \, d\mu \right\| \le \sup_{\lambda \in X} |f(\lambda)|. \tag{7.16} \]


3. Integration is multiplicative: For all f and g, we have

\[ \int_X f g \, d\mu = \left( \int_X f \, d\mu \right) \left( \int_X g \, d\mu \right). \tag{7.17} \]


4. For all f, we have

\[ \int_X \bar{f} \, d\mu = \left( \int_X f \, d\mu \right)^*. \]

In particular, if f is real-valued, then $\int_X f \, d\mu$ is self-adjoint.


By Property 1 and linearity, integration with respect to $\mu$ has the expected
behavior on simple functions. It then follows from Property 2 that
the integral of an arbitrary bounded measurable function f can be computed
as follows. Take a sequence $s_n$ of simple functions converging uniformly to
f; the integral of f is then the limit, in the operator norm topology, of the
integrals of the $s_n$’s.
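On a finite set every function is simple, so the operator-valued integral is a finite sum that can be written down directly. The sketch below assumes a hypothetical model with X = {0, 1, 2}, $H = \mathbb{C}^3$, and $\mu(\{j\})$ the projection onto the j-th coordinate; the functions f, g and all names are illustrative.

```python
# Hypothetical finite model: X = {0, 1, 2}, H = C^3, and mu({j}) is the
# projection onto the j-th coordinate. The integral of f is then the
# diagonal matrix with entries f(0), f(1), f(2).

def integrate(f):
    """Operator-valued integral of f: sum over j of f(j) * mu({j})."""
    return [[complex(f(i)) if i == j else 0j for j in range(3)] for i in range(3)]

def mat_mul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def quadratic_form(op, psi):
    """<psi, op psi>, conjugate-linear in the first argument."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(3)) for i in range(3)]
    return sum(p.conjugate() * q for p, q in zip(psi, op_psi))

def integrate_against_mu_psi(f, psi):
    """Ordinary integral of f against mu_psi, where mu_psi({j}) = |psi_j|^2."""
    return sum(f(j) * abs(psi[j]) ** 2 for j in range(3))

f = lambda j: 1.0 + 2.0j * j      # an arbitrary complex-valued function on X
g = lambda j: float(j * j - 1)    # an arbitrary real-valued function on X
psi = [0.6 + 0j, 0j, 0.8j]
```

With these definitions, `quadratic_form(integrate(f), psi)` and `integrate_against_mu_psi(f, psi)` should agree, which is the defining property (7.15), and the integral of the product f g should coincide with the matrix product of the two integrals.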

Although the multiplicative property of the integral may seem surprising
at first, observe that for any $E_1, E_2 \in \Omega$, Property 4 in Definition 7.10 tells
us that

\[ \begin{aligned}
\left( \int_X 1_{E_1} \, d\mu \right) \left( \int_X 1_{E_2} \, d\mu \right) &= \mu(E_1)\mu(E_2) = \mu(E_1 \cap E_2) \\
&= \int_X 1_{E_1} \cdot 1_{E_2} \, d\mu.
\end{aligned} \]

Thus, multiplicativity of the integral at the level of indicator functions is
built into the definition of a projection-valued measure.

If one wanted to make a real-valued measure for which the corresponding
integral was multiplicative, then since $1_E \cdot 1_E = 1_E$, the integral of $1_E$
(namely, $\mu(E)$) would have to satisfy $\mu(E)^2 = \mu(E)$. This would mean
that $\mu(E)$ is 0 or 1 for all E. For such measures, one would indeed obtain
multiplicativity of the integral, but measures with this property are not
very interesting. For operator-valued measures, we can have interesting
examples where the integral is multiplicative, simply because there are
many more idempotents (elements A with $A^2 = A$) in B(H) than in $\mathbb{R}$.

Proof of Proposition 7.11. Given a projection-valued measure $\mu$ and a
bounded measurable function f on X, define a map $Q_f : H \to \mathbb{C}$ by

\[ Q_f(\psi) = \int_X f \, d\mu_\psi, \]

where $\mu_\psi$ is given by (7.14). If f is an indicator function $1_E$, then
$Q_f(\psi) = \langle \psi, \mu(E)\psi \rangle$ is a bounded quadratic form. (See Definition A.60.) It is
straightforward to show, passing from indicator functions to simple functions and
then to general functions, that for any bounded measurable f, $Q_f$ is a
bounded quadratic form, with

\[ |Q_f(\psi)| \le \left( \sup_{\lambda \in X} |f(\lambda)| \right) \|\psi\|^2. \tag{7.18} \]


It then follows from Proposition A.63 that there is a unique bounded
operator $A_f$ such that

\[ Q_f(\psi) = \langle \psi, A_f \psi \rangle \]

for all $\psi \in H$. We set $\int_X f \, d\mu = A_f$. From the way $A_f$ is defined, it
satisfies (7.15). The uniqueness of the linear map $f \mapsto \int_X f \, d\mu$ follows
from the uniqueness in Proposition A.63.

If $f = 1_E$, then $Q_f(\psi) = \mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle$, in which case the unique
associated operator $A_f$ is $\mu(E)$. This establishes Property 1. Property 2
follows from (7.18).

For Property 3, we have already observed that multiplicativity of the
integral, at the level of indicator functions, is built into the definition of a
projection-valued measure. Since both sides of (7.17) are bilinear in (f, g),
we have (7.17) for simple functions. Using Property 2, we can then obtain
(7.17) for all bounded measurable functions by taking limits.

Finally, if f is real valued, then $Q_f(\psi)$ will be real for all $\psi \in H$. Thus, by
Proposition A.63, the associated operator $A_f$ will be self-adjoint. Property
4 then follows by linearity. ∎


7.2.3 The Spectral Theorem


We are ready to state one version of the spectral theorem for bounded 
self-adjoint operators. 


Theorem 7.12 (Spectral Theorem, First Form) If $A \in B(H)$ is self-adjoint,
then there exists a unique projection-valued measure $\mu^A$ on the
Borel $\sigma$-algebra in $\sigma(A)$, with values in projections on H, such that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \tag{7.19} \]

Since the spectrum $\sigma(A)$ of A is bounded, the function $f(\lambda) := \lambda$ is
bounded on $\sigma(A)$. The proof of this theorem is given in Chap. 8.


Definition 7.13 (Functional Calculus) If $A \in \mathcal{B}(\mathbf{H})$ is
self-adjoint and $f : \sigma(A) \to \mathbb{C}$ is a bounded measurable
function, define an operator $f(A)$ by setting

\[ f(A) = \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda), \]

where $\mu^A$ is the projection-valued measure in Theorem 7.12.


We may extend the projection-valued measure $\mu^A$ from $\sigma(A)$ to all of
$\mathbb{R}$ by assigning measure 0 to $\mathbb{R} \setminus \sigma(A)$. Then,
roughly speaking, $f(A)$ is the operator that is equal to $f(\lambda) I$ on the
range of the projection operator $\mu^A([\lambda, \lambda + d\lambda))$.

Since the integral with respect to $\mu^A$ is multiplicative, it follows from
(7.19) that if $f(\lambda) = \lambda^m$ for some positive integer $m$, then
$f(A)$ is the $m$th power of $A$. Further, since the series
$e^{a\lambda} = \sum_{m=0}^{\infty} (a\lambda)^m / m!$ converges uniformly on
the compact set $\sigma(A)$, the operator $e^{aA}$ (computed using the
functional calculus for the function $f(\lambda) = e^{a\lambda}$) may be
computed as a power series.
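The two claims above can be checked by hand in the simplest setting. The following is a minimal sketch, assuming a real symmetric $2 \times 2$ matrix with two distinct eigenvalues, so that the projection-valued measure reduces to two projections $P_j$ and the integral to the finite sum $\sum_j f(\lambda_j) P_j$. The function names (`spectral_projections`, `functional_calculus`, `exp_power_series`) and the sample matrix are illustrative choices, not notation from the text.

```python
import math

def spectral_projections(a, b, d):
    """Return [(eigenvalue, projection)] for the symmetric matrix [[a, b], [b, d]].

    Assumes two distinct eigenvalues, so each spectral projection can be
    written as P_j = (A - lam_k I) / (lam_j - lam_k), with lam_k the other one.
    """
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    if radius == 0.0:
        raise ValueError("this sketch assumes two distinct eigenvalues")
    A = [[a, b], [b, d]]
    pairs = []
    for lam_j, lam_k in ((mean - radius, mean + radius),
                         (mean + radius, mean - radius)):
        P = [[(A[r][c] - (lam_k if r == c else 0.0)) / (lam_j - lam_k)
              for c in range(2)] for r in range(2)]
        pairs.append((lam_j, P))
    return pairs

def functional_calculus(f, a, b, d):
    """f(A) as the finite 'integral' sum_j f(lam_j) P_j."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, P in spectral_projections(a, b, d):
        for r in range(2):
            for c in range(2):
                out[r][c] += f(lam) * P[r][c]
    return out

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def exp_power_series(A, a_coeff, terms=40):
    """Partial sum of sum_m (a_coeff * A)^m / m!."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(1, terms):
        # term_m = term_{m-1} * A * a_coeff / m
        term = [[a_coeff * x / m for x in row] for row in matmul(term, A)]
        for r in range(2):
            for c in range(2):
                total[r][c] += term[r][c]
    return total

def max_diff(X, Y):
    return max(abs(X[r][c] - Y[r][c]) for r in range(2) for c in range(2))

A = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues 1 and 3
square = functional_calculus(lambda lam: lam ** 2, 2.0, 1.0, 2.0)
exp_fc = functional_calculus(lambda lam: math.exp(0.5 * lam), 2.0, 1.0, 2.0)
err_square = max_diff(square, matmul(A, A))             # f(lam) = lam^2 vs. A*A
err_exp = max_diff(exp_fc, exp_power_series(A, 0.5))    # e^{aA} vs. power series
```

Both `err_square` and `err_exp` should be at the level of floating-point rounding, which is the finite-dimensional shadow of the multiplicativity and uniform-convergence arguments in the paragraph above.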


Definition 7.14 (Spectral Subspaces) For $A \in \mathcal{B}(\mathbf{H})$, let
$\mu^A$ be the associated projection-valued measure, extended to be a measure
on $\mathbb{R}$ by setting $\mu^A(\mathbb{R} \setminus \sigma(A)) = 0$. Then
for each Borel set $E \subset \mathbb{R}$, define the spectral subspace $V_E$
of $\mathbf{H}$ by

\[ V_E = \mathrm{Range}(\mu^A(E)). \]


The definition of a projection-valued measure implies that these spectral
subspaces satisfy the first four properties listed in Sect. 7.2.1. We now show
that (7.19) implies the remaining two properties we anticipated for the
spectral subspaces.

Proposition 7.15 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the
spectral subspaces associated with $A$ have the following properties.


1. Each spectral subspace $V_E$ is invariant under $A$.


2. If $E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$, then for
all $\psi \in V_E$, we have
$\|(A - \lambda_0 I)\psi\| \le \varepsilon \|\psi\|$.


3. The spectrum of $A|_{V_E}$ is contained in the closure of $E$.


4. If $\lambda_0$ is in the spectrum of $A$, then for every neighborhood $U$
of $\lambda_0$, we have $V_U \ne \{0\}$, or, equivalently, $\mu^A(U) \ne 0$.


Proof. For Point 1, observe that for any bounded measurable functions $f$
and $g$ on $\sigma(A)$, the operators $f(A)$ and $g(A)$ commute, since the
product in either order is equal to the integral of the function $fg = gf$
with respect to $\mu^A$. In particular, $A$, which is the integral of the
function $f(\lambda) = \lambda$, commutes with $\mu^A(E)$, which is the
integral of the function $1_E$. Thus, given a vector $\mu^A(E)\phi$ in the
range of $\mu^A(E)$, we have

\[ A \mu^A(E) \phi = \mu^A(E) A \phi, \]

which is again in the range of $\mu^A(E)$, establishing the invariance of the
spectral subspace.

For Point 2, suppose that $\psi \in V_E$, where
$E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon]$. Then $\psi$ is
in the range of $\mu^A(E)$, and so

\[ (A - \lambda_0 I)\psi = (A - \lambda_0 I) \mu^A(E) \psi. \]

But $\mu^A(E) = 1_E(A)$ and $A - \lambda_0 I = f(A)$, where
$f(\lambda) = \lambda - \lambda_0$. By the multiplicativity of the integral,
then,

\[ (A - \lambda_0 I)\psi = (f 1_E)(A) \psi. \]

But $|f(\lambda) 1_E(\lambda)| \le \varepsilon$ and so by (7.16), the operator
$(f 1_E)(A)$ has norm at most $\varepsilon$.

For Point 3, if $\lambda_0$ is not in $\bar{E}$, then the function
$g(\lambda) := 1_E(\lambda) \bigl( 1/(\lambda - \lambda_0) \bigr)$ is bounded.
Thus, $g(A)$ is a bounded operator and

\[ g(A)(A - \lambda_0 I) = (A - \lambda_0 I) g(A) = 1_E(A). \]

Since $1_E(A) = \mu^A(E)$ acts as the identity on $V_E$, this shows that the
restriction to $V_E$ of $g(A)$ is the inverse of the restriction to $V_E$ of
$A - \lambda_0 I$. Thus, $\lambda_0$ is not in the spectrum of $A|_{V_E}$.

For Point 4, fix $\lambda_0 \in \sigma(A)$ and suppose for some
$\varepsilon > 0$, we have
$\mu^A((\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)) = 0$. Consider,
then, the bounded function $f$ defined by

\[ f(\lambda) = \begin{cases} \dfrac{1}{\lambda - \lambda_0}, & |\lambda - \lambda_0| \ge \varepsilon, \\ 0, & |\lambda - \lambda_0| < \varepsilon. \end{cases} \]

Since $f(\lambda) \cdot (\lambda - \lambda_0)$ equals 1 except on
$(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$, the equation
$f(\lambda) \cdot (\lambda - \lambda_0) = 1$ holds $\mu^A$-almost everywhere.
Thus, the integral of this function coincides with the integral of the
constant function 1, which is $I$. Since the integral is multiplicative, we
see that

\[ f(A)(A - \lambda_0 I) = (A - \lambda_0 I) f(A) = I, \]

showing that the bounded operator $f(A)$ is the inverse of
$(A - \lambda_0 I)$. This contradicts the assumption that
$\lambda_0 \in \sigma(A)$. ∎
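For orientation, here is a worked instance of Points 1 and 2 in a standard example that is not part of the surrounding text: $\mathbf{H} = L^2([0,1])$ with $A$ the operator of multiplication by $x$. The sketch assumes the familiar fact that, for this operator, $\mu^A(E)$ is multiplication by the indicator function $1_E$.

```latex
\begin{align*}
(A\psi)(x) &= x\,\psi(x), \qquad (\mu^A(E)\psi)(x) = 1_E(x)\,\psi(x), \\
V_E &= \{\psi \in L^2([0,1]) : \psi = 0 \text{ almost everywhere outside } E\}, \\
\|(A - \lambda_0 I)\psi\|^2
  &= \int_E |x - \lambda_0|^2\,|\psi(x)|^2\,dx
  \le \varepsilon^2 \|\psi\|^2
  \quad \text{for } \psi \in V_E,\ E \subset [\lambda_0 - \varepsilon, \lambda_0 + \varepsilon].
\end{align*}
```

Multiplying a function supported in $E$ by $x$ leaves it supported in $E$, which is the invariance in Point 1, and the displayed estimate is Point 2.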


Proposition 7.16 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and
$B \in \mathcal{B}(\mathbf{H})$ commutes with $A$, the following results hold.


1. For all bounded measurable functions $f$ on $\sigma(A)$, the operator
$f(A)$ commutes with $B$.


2. Each spectral subspace for $A$ is invariant under $B$.


The proof of this proposition is deferred until Chap. 8. We conclude this
section by fulfilling (at least for bounded self-adjoint operators) one of
the goals of the spectral theorem, namely to give a probability measure
describing the probabilities for measurements of a self-adjoint operator $A$
in the state $\psi$.


Proposition 7.17 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and
$\psi \in \mathbf{H}$ is a unit vector. Then there exists a unique probability
measure $\mu^A_\psi$ on $\mathbb{R}$ such that

\[ \int_{\mathbb{R}} \lambda^m \, d\mu^A_\psi(\lambda) = \langle \psi, A^m \psi \rangle \]

for all non-negative integers $m$.


We will prove a version of Proposition 7.17 for unbounded self-adjoint
operators in Chap. 9. In the unbounded case, however, we will not obtain
uniqueness of the probability measure, even if $\psi$ is in the domain of
$A^m$ for all $m$. Even in the unbounded case, however, the spectral theorem
provides a canonical choice of the probability measure.

Proof. We define a measure $\mu^A_\psi$ on $\sigma(A)$ as in Sect. 7.2.2 by

\[ \mu^A_\psi(E) = \langle \psi, \mu^A(E) \psi \rangle. \]

The properties of integration with respect to $\mu^A$ then tell us that

\[ \langle \psi, A^m \psi \rangle = \Bigl\langle \psi, \Bigl( \int_{\sigma(A)} \lambda^m \, d\mu^A(\lambda) \Bigr) \psi \Bigr\rangle = \int_{\sigma(A)} \lambda^m \, d\mu^A_\psi(\lambda). \]

We then extend $\mu^A_\psi$ to $\mathbb{R}$ by setting it equal to zero on
$\mathbb{R} \setminus \sigma(A)$, establishing the existence of the desired
probability measure on $\mathbb{R}$. Since

\[ |\langle \psi, A^m \psi \rangle| \le \|\psi\|^2 \|A^m\| \le \|\psi\|^2 \|A\|^m, \]

the moments grow only exponentially with $m$. Thus, standard uniqueness
results for the moment problem (e.g., Theorem 8.1 in Chap. 4 of [18]) give
the uniqueness of $\mu^A_\psi$. ∎
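In finite dimensions the measure $\mu^A_\psi$ is a finite sum of point masses, and the moment identity can be checked directly. The following is a minimal sketch, assuming a real symmetric $2 \times 2$ matrix with distinct eigenvalues and a real unit vector; the helper names (`spectral_measure`, `moment_from_measure`, `moment_from_operator`) and the sample data are hypothetical illustrations, not objects from the text.

```python
import math

def spectral_measure(a, b, d, psi):
    """Return [(eigenvalue, weight)] with weight = <psi, P_j psi>.

    A = [[a, b], [b, d]] is real symmetric with two distinct eigenvalues,
    and P_j psi is computed as (A - lam_k I) psi / (lam_j - lam_k).
    """
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    if radius == 0.0:
        raise ValueError("this sketch assumes two distinct eigenvalues")
    A = [[a, b], [b, d]]
    atoms = []
    for lam_j, lam_k in ((mean - radius, mean + radius),
                         (mean + radius, mean - radius)):
        P_psi = [(sum(A[r][c] * psi[c] for c in range(2)) - lam_k * psi[r])
                 / (lam_j - lam_k) for r in range(2)]
        weight = sum(psi[r] * P_psi[r] for r in range(2))
        atoms.append((lam_j, weight))
    return atoms

def moment_from_measure(atoms, m):
    """Integral of lam^m against the point masses."""
    return sum(weight * lam ** m for lam, weight in atoms)

def moment_from_operator(a, b, d, psi, m):
    """<psi, A^m psi> by repeated matrix-vector multiplication."""
    A = [[a, b], [b, d]]
    v = list(psi)
    for _ in range(m):
        v = [sum(A[r][c] * v[c] for c in range(2)) for r in range(2)]
    return sum(psi[r] * v[r] for r in range(2))

psi = [0.6, 0.8]                                  # unit vector
atoms = spectral_measure(2.0, 1.0, 2.0, psi)      # masses at eigenvalues 1 and 3
total_mass = sum(weight for _, weight in atoms)
moment_gaps = [abs(moment_from_measure(atoms, m)
                   - moment_from_operator(2.0, 1.0, 2.0, psi, m))
               for m in range(7)]
```

Here `total_mass` should equal 1 up to rounding (the case $m = 0$), and every entry of `moment_gaps` should be at rounding level, matching the identity in Proposition 7.17.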


7.3 Spectral Theorem for Bounded Self-Adjoint 
Operators, II 


As we have already noted in Sect. 6.5, one version of the spectral theorem
asserts that every self-adjoint operator is unitarily equivalent to a
multiplication operator. In the case of a bounded self-adjoint operator $A$ on
a separable Hilbert space $\mathbf{H}$, this result means that $A$ is unitarily
equivalent to the operator $M_h$ on $L^2(X, \mu)$, where $(X, \mu)$ is a
$\sigma$-finite measure space, $h$ is a measurable, real-valued function, and
$M_h$ is the operator of multiplication by $h$:

\[ (M_h \psi)(\lambda) = h(\lambda) \psi(\lambda). \]

Although the "multiplication operator" form of the spectral theorem
(Theorem 7.20) has the advantage of being easy to state, there is an even
better version involving the concept of a direct integral. It is
straightforward to extend the notion of an $L^2$ space to an $L^2$ space with
values in a Hilbert space $\mathbf{H}$. In a direct integral, we extend the
concept one step further, by allowing the Hilbert space to depend on the
point. We begin with a measure space $(X, \mu)$ and then have one Hilbert space
$\mathbf{H}_\lambda$ for each $\lambda$ in $X$. An element of the direct
integral is a function $s$ on $X$ such that $s(\lambda)$ belongs to
$\mathbf{H}_\lambda$ for each $\lambda \in X$. Given a real-valued measurable
function $h$ on $X$, it makes sense to multiply an element $s$ of the direct
integral by $h$.

The direct integral form of the spectral theorem says a bounded self-adjoint
operator $A$ is unitarily equivalent to a multiplication operator on a
direct integral. By extending multiplication operators to the more general
setting of direct integrals (instead of just ordinary $L^2$ spaces), we gain
several benefits. First, the set $X$ and the function $h$ become canonical:
The set $X$ is simply the spectrum of $A$ and the function $h$ is simply
$h(\lambda) = \lambda$. Second, the direct integral approach carries with it a
notion of "generalized eigenvectors," since the space $\mathbf{H}_\lambda$ can
be thought of as the space of generalized eigenvectors with eigenvalue
$\lambda$. (The spaces $\mathbf{H}_\lambda$ are not, in general, contained in
the direct integral Hilbert space. Thus, direct integrals give a rigorous
meaning to the idea of "eigenvectors" that are not in the Hilbert space on
which the operator acts.) Third, the direct integral approach gives a simple
way to classify self-adjoint operators up to unitary equivalence: Two
self-adjoint operators are unitarily equivalent if and only if their direct
integral representations are equivalent in a natural sense (Proposition 7.24).

If one really wants the simplicity of the (ordinary) multiplication operator
version of the spectral theorem, it is a simple matter to prove this result
using precisely the same methods as in the proof of the direct integral
version. (See Theorem 7.20.) Nevertheless, the direct integral version is,
arguably, the most definitive version of the spectral theorem for a single
self-adjoint operator.

We turn now to the definition of a direct integral. Suppose $\mu$ is a
$\sigma$-finite measure on a $\sigma$-algebra $\Omega$ of sets in $X$. Suppose
also that for each $\lambda \in X$, we have a separable Hilbert space
$\mathbf{H}_\lambda$ with inner product
$\langle \cdot, \cdot \rangle_\lambda$. We want to define the direct integral
of the $\mathbf{H}_\lambda$'s with respect to $\mu$. Elements of the direct
integral will be sections $s$, meaning that $s$ is a function on $X$ with
values in the union of the $\mathbf{H}_\lambda$'s, having the property that

\[ s(\lambda) \in \mathbf{H}_\lambda \]

for each $\lambda$ in $X$. We would like to define the norm of a section $s$
by the formula

\[ \|s\|^2 = \int_X \langle s(\lambda), s(\lambda) \rangle_\lambda \, d\mu(\lambda), \]

provided that the integral on the right-hand side is finite. The inner product
of two sections $s_1$ and $s_2$ (with finite norm) should then be given by the
formula

\[ \langle s_1, s_2 \rangle = \int_X \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda \, d\mu(\lambda). \]

The problem with this description of the norm and inner product on
the direct integral is that we have not said anything about measurability.
As things stand, it does not make sense to ask whether a section $s$ is
measurable, since the space in which $s(\lambda)$ takes its values is different
for each $\lambda$. We must, therefore, introduce some additional structure
that gives rise to a notion of measurability. (The measurability issue is a
technicality that can be ignored on a first reading.)

One way to address the measurability issue is to choose a simultaneous
orthonormal basis for each of the Hilbert spaces $\mathbf{H}_\lambda$. To deal
with the possibility that different spaces can have different dimensions, we
slightly modify the concept of an orthonormal basis. We say that a family
$\{e_j\}$ of vectors is an orthonormal basis for a Hilbert space $\mathbf{H}$
if $\langle e_j, e_k \rangle = 0$ for $j \ne k$, the norm of each $e_j$ is
either 0 or 1, and the closure of the span of the $e_j$'s is all of
$\mathbf{H}$. This just means that we allow some of the vectors in our basis to
be zero, with the nonzero vectors forming an orthonormal basis in the usual
sense.

We now define a simultaneous orthonormal basis for a family
$\{\mathbf{H}_\lambda\}$ of separable Hilbert spaces to be a collection
$\{e_j(\cdot)\}_{j=1}^{\infty}$ of sections with the property that for each
$\lambda$, $\{e_j(\lambda)\}_{j=1}^{\infty}$ is an orthonormal basis for
$\mathbf{H}_\lambda$. Provided that the function
$\lambda \mapsto \dim \mathbf{H}_\lambda$ is a measurable function from $X$
into $[0, \infty]$, it is possible to choose a simultaneous orthonormal basis
$\{e_j(\cdot)\}$ such that $\langle e_j(\lambda), e_k(\lambda) \rangle_\lambda$
is measurable for all $j$ and $k$. Having chosen a simultaneous orthonormal
basis with this property, we define a section $s$ to be measurable if the
function

\[ \lambda \mapsto \langle e_j(\lambda), s(\lambda) \rangle_\lambda \]

is a measurable complex-valued function for each $j$. Our assumption on the
$e_j$'s means that the $e_j$'s themselves are measurable sections.

We refer to a choice of simultaneous orthonormal basis, chosen so that
$\langle e_j(\lambda), e_k(\lambda) \rangle_\lambda$ is measurable, as a
measurability structure on the collection of $\mathbf{H}_\lambda$'s. Given two
measurable sections $s_1$ and $s_2$, the function

\[ \lambda \mapsto \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda = \sum_{j=1}^{\infty} \langle s_1(\lambda), e_j(\lambda) \rangle_\lambda \, \langle e_j(\lambda), s_2(\lambda) \rangle_\lambda \]

is also measurable.


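Definition 7.18 Suppose the following structures are given: (1) a
$\sigma$-finite measure space $(X, \Omega, \mu)$, (2) a collection
$\{\mathbf{H}_\lambda\}_{\lambda \in X}$ of separable Hilbert spaces for which
the dimension function is measurable, and (3) a measurability structure on
$\{\mathbf{H}_\lambda\}_{\lambda \in X}$. Then the direct integral of the
$\mathbf{H}_\lambda$'s with respect to $\mu$, denoted

\[ \int_X^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda), \]

is the space of equivalence classes of almost-everywhere-equal measurable
sections $s$ for which

\[ \|s\|^2 = \int_X \langle s(\lambda), s(\lambda) \rangle_\lambda \, d\mu(\lambda) < \infty. \]

The inner product $\langle s_1, s_2 \rangle$ of two such sections $s_1$ and
$s_2$ is given by the formula

\[ \langle s_1, s_2 \rangle = \int_X \langle s_1(\lambda), s_2(\lambda) \rangle_\lambda \, d\mu(\lambda). \]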


To see that the integral defining the inner product of two finite-norm
sections is finite, note that
$|\langle s_1(\lambda), s_2(\lambda) \rangle_\lambda| \le \|s_1(\lambda)\|_\lambda \, \|s_2(\lambda)\|_\lambda$.
By assumption, $\|s_j(\lambda)\|_\lambda$ is a square-integrable function of
$\lambda$ for $j = 1, 2$, and the product of two square-integrable functions is
integrable. Thus, the integrand in the definition of
$\langle s_1, s_2 \rangle$ is also integrable. It is not hard to show, using an
argument similar to the proof of completeness of $L^2$ spaces, that a direct
integral of Hilbert spaces is a Hilbert space.

Let us think of two important special cases of the direct integral
construction. First, if each of the $\mathbf{H}_\lambda$'s is simply
$\mathbb{C}$, then the direct integral (with the obvious measurability
structure) is simply $L^2(X, \mu)$. Second, suppose that
$X = \{\lambda_1, \lambda_2, \ldots\}$ is countable, $\Omega$ is the
$\sigma$-algebra of all subsets of $X$, and $\mu$ is the counting measure on
$X$. Then the direct integral is the Hilbert space direct sum
(Definition A.45).

Given a direct integral, suppose we have some $\lambda_0 \in X$ for which
$\{\lambda_0\}$ is measurable and such that $c := \mu(\{\lambda_0\}) > 0$. Then
we can embed $\mathbf{H}_{\lambda_0}$ isometrically into the direct integral by
mapping each $\psi \in \mathbf{H}_{\lambda_0}$ to the section $s$ given by

\[ s(\lambda) = \begin{cases} \dfrac{1}{\sqrt{c}} \, \psi, & \lambda = \lambda_0, \\ 0, & \lambda \ne \lambda_0. \end{cases} \]

Even if $\mu(\{\lambda_0\}) = 0$, we may still think that
$\mathbf{H}_{\lambda_0}$ is a sort of "generalized subspace" of the direct
integral.


Theorem 7.19 (Spectral Theorem, Second Form) If
$A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then there exists a
$\sigma$-finite measure $\mu$ on $\sigma(A)$, a direct integral

\[ \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda), \]

and a unitary map $U$ between $\mathbf{H}$ and the direct integral such that

\[ [U A U^{-1}(s)](\lambda) = \lambda s(\lambda) \tag{7.20} \]

for all sections $s$ in the direct integral.


The proof of Theorem 7.19 is given in the next chapter, along with the
proof of our first version of the spectral theorem. In the meantime, let us
think about what this version of the spectral theorem is saying. We may
think that the unitary map $U$ is an identification of our original Hilbert
space $\mathbf{H}$ with a certain direct integral over the spectrum of $A$.
Under this identification, the self-adjoint operator $A$ becomes the operator
of multiplication by $\lambda$, that is, the map sending the section
$s(\lambda)$ to $\lambda s(\lambda)$. Roughly speaking, then, the operator $A$
acts (under our identification) as $\lambda I$ on each space
$\mathbf{H}_\lambda$. Thus, we may think of $\mathbf{H}_\lambda$ as being
something like an "eigenspace" for $A$, for each element $\lambda$ of the
spectrum of $A$. Of course, unless $\mu(\{\lambda\}) > 0$, the Hilbert space
$\mathbf{H}_\lambda$ is not actually contained in $\mathbf{H}$. Nevertheless,
we may think of elements of a given $\mathbf{H}_\lambda$ as "generalized
eigenvectors" for the operator $A$.
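A finite-dimensional instance may help fix ideas. The following worked example is an illustration chosen here, not one from the text: it takes $A = \mathrm{diag}(1, 1, 3)$ on $\mathbb{C}^3$ and assumes the counting measure on $\sigma(A)$, in which case the direct integral is just a direct sum of eigenspaces.

```latex
\begin{align*}
\sigma(A) &= \{1, 3\}, \qquad \mu = \text{counting measure on } \sigma(A), \qquad
\mathbf{H}_1 = \mathbb{C}^2, \quad \mathbf{H}_3 = \mathbb{C}, \\
\int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda)
  &= \mathbf{H}_1 \oplus \mathbf{H}_3, \qquad
  \|s\|^2 = \|s(1)\|^2 + \|s(3)\|^2, \\
U(v_1, v_2, v_3) &= s \quad \text{with } s(1) = (v_1, v_2),\ s(3) = v_3, \\
[U A U^{-1}(s)](1) &= (v_1, v_2) = 1 \cdot s(1), \qquad
[U A U^{-1}(s)](3) = 3 v_3 = 3 \cdot s(3).
\end{align*}
```

Since $\mu(\{\lambda\}) > 0$ for both points of the spectrum, each $\mathbf{H}_\lambda$ is a genuine eigenspace here; the "generalized" case arises only when points of the spectrum have measure zero.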

The direct integral formulation of the spectral theorem leads readily to a
classification result for bounded self-adjoint operators. See Proposition 7.24
later in this section. Meanwhile, as we noted earlier in this section, the
method of proof for Theorem 7.19 also yields a version of the spectral
theorem involving multiplication operators on ordinary $L^2$ spaces.


Theorem 7.20 (Spectral Theorem, Multiplication Operator Form)
Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. Then there exists a
$\sigma$-finite measure space $(X, \mu)$, a bounded, measurable, real-valued
function $h$ on $X$, and a unitary map $U : \mathbf{H} \to L^2(X, \mu)$ such
that

\[ [U A U^{-1}(\psi)](\lambda) = h(\lambda) \psi(\lambda) \]

for all $\psi \in L^2(X, \mu)$.

We return now to a discussion of the direct integral version of the spectral
theorem. This version gives a simple description of the functional calculus.


Proposition 7.21 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and
$U$ is a unitary map as in Theorem 7.19. Then for any bounded measurable
function $f$ on $\sigma(A)$, we have

\[ [U f(A) U^{-1}(s)](\lambda) = f(\lambda) s(\lambda). \]


Thus, roughly speaking, $f(A)$ is defined to be $f(\lambda) I$ on each
"generalized eigenspace" $\mathbf{H}_\lambda$. Proposition 7.21 follows
directly from (7.20) if $f$ is a polynomial; the result for continuous $f$ then
follows by taking uniform limits. The result for general $f$ is then easily
established by using the limiting arguments of Chap. 8, especially Exercise 3.

Let us now consider what sort of uniqueness there should be in the second
version of the spectral theorem. There is a "trivial" source of nonuniqueness
coming from the possibility that some of the $\mathbf{H}_\lambda$'s may have
dimension 0. Let $E_0$ denote the set of $\lambda$ for which
$\dim \mathbf{H}_\lambda = 0$. Even if $\mu(E_0) > 0$, the set $E_0$ makes no
contribution to the norm of a section, since every section is automatically
zero on $E_0$. Thus, we may define a new measure $\tilde{\mu}$ by setting
$\tilde{\mu}(E) = \mu(E \cap E_0^c)$, so that $\tilde{\mu}$ agrees with $\mu$
on $E_0^c$ but is zero on $E_0$. Then the direct integrals of the
$\mathbf{H}_\lambda$'s with respect to $\mu$ and with respect to $\tilde{\mu}$
are "indistinguishable." Thus, we can always modify a direct integral so as to
assume that $\dim \mathbf{H}_\lambda > 0$ for almost every $\lambda$.

Meanwhile, unlike the projection-valued measure $\mu^A$ in Theorem 7.12,
the measure $\mu$ in Theorem 7.19 is not unique, but only unique up to
equivalence, where two $\sigma$-finite measures on a given measurable space are
equivalent if they have precisely the same sets of measure zero. For a given
measure $\mu$, the Hilbert spaces $\mathbf{H}_\lambda$ are unique only up to
unitary equivalence, meaning that only the dimension of the spaces is uniquely
determined. Even the dimension of $\mathbf{H}_\lambda$ is uniquely determined
only up to a set of $\mu$-measure zero. As it turns out, the sources of
nonuniqueness in this paragraph and the previous paragraph are all that exist.


Proposition 7.22 (Uniqueness in Theorem 7.19) Suppose
$A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and consider two different
direct integrals as in Theorem 7.19, one with measure $\mu^{(1)}$ and Hilbert
spaces $\mathbf{H}^{(1)}_\lambda$ and the other with measure $\mu^{(2)}$ and
Hilbert spaces $\mathbf{H}^{(2)}_\lambda$. If
$\dim \mathbf{H}^{(j)}_\lambda > 0$ for $\mu^{(j)}$-almost every $\lambda$
($j = 1, 2$), then $\mu^{(1)}$ and $\mu^{(2)}$ are mutually absolutely
continuous and

\[ \dim \mathbf{H}^{(1)}_\lambda = \dim \mathbf{H}^{(2)}_\lambda \]

for $\mu^{(j)}$-almost every $\lambda$ ($j = 1, 2$).


See the end of the next chapter for a sketch of the proof of this uniqueness
result.

Theorem 7.19 should be thought of as a refinement of our earlier form
(Theorem 7.12) of the spectral theorem, in the sense that we can easily
recover Theorem 7.12 from Theorem 7.19. In the setting of Theorem 7.19,
and given a measurable set $E \subset \sigma(A)$, let $V_E$ denote the space of
(equivalence classes of) sections $s$ that are supported on $E$, that is, for
which $s(\lambda) = 0$ for $\mu$-almost every $\lambda$ in $E^c$. This is
easily seen to be a closed subspace. Let $P_E$ denote the orthogonal projection
onto $V_E$, and define

\[ \mu^A(E) = U^{-1} P_E U. \tag{7.21} \]

It is straightforward to check that $\mu^A$ is a projection-valued measure on
$\sigma(A)$, with values in $\mathcal{B}(\mathbf{H})$, and that
$\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A$.

Note that both versions of the spectral theorem for $A$ involve a measure,
the first, denoted $\mu^A$, being a projection-valued measure, and the second,
denoted $\mu$, being an ordinary measure with values in the non-negative real
numbers. The following result shows the relationship between the two measures.


Proposition 7.23 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint,
$\mu^A$ is the projection-valued measure given by Theorem 7.12 and $\mu$ is a
real-valued measure as in Theorem 7.19. If $\dim \mathbf{H}_\lambda > 0$ for
$\mu$-almost every $\lambda$, then for any Borel set $E \subset \sigma(A)$,
$\mu^A(E) = 0$ if and only if $\mu(E) = 0$.


Of course, the 0 in the expression $\mu^A(E) = 0$ is the zero operator, whereas
the 0 in the expression $\mu(E) = 0$ is the number 0. Nevertheless, we may
think of Proposition 7.23 as saying that $\mu^A$ and $\mu$ are equivalent in
the usual measure-theoretic sense, having precisely the same sets of measure
zero.
Proof. As we have remarked, given a direct integral as in Theorem 7.19,
we can construct a projection-valued measure by means of (7.21), and this
projection-valued measure satisfies
$\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A$. This projection-valued
measure must coincide with the one in Theorem 7.12, by the uniqueness in that
theorem.

Now, if $\mu(E) = 0$, then any section supported on $E$ is zero almost
everywhere and thus represents the zero element of the direct integral. In that
case, $V_E = \{0\}$ and so $\mu^A(E) = 0$ by (7.21). In the other direction,
suppose $\mu(E) > 0$. Since $\mu$ is $\sigma$-finite, $E$ will contain a
measurable subset $F$ such that $0 < \mu(F) < \infty$. Then let $s$ be the
section given by

\[ s(\lambda) = \sum_{j=1}^{\infty} \frac{1}{2^j} e_j(\lambda) \]

for $\lambda \in F$ and $s(\lambda) = 0$ for $\lambda \in F^c$, where
$\{e_j(\cdot)\}$ is our measurability structure for the direct integral. Then

\[ \langle s(\lambda), e_j(\lambda) \rangle_\lambda = \frac{1}{2^j} \langle e_j(\lambda), e_j(\lambda) \rangle_\lambda \, 1_F(\lambda), \]

which is a measurable function of $\lambda$ for all $j$, so that $s$ is
measurable. Since we assume that $\mathbf{H}_\lambda$ has nonzero dimension for
$\mu$-almost every $\lambda$, $s$ will be nonzero almost everywhere on $F$ and
thus will have positive norm. The norm of $s$ is finite because
$\|s(\lambda)\| < 1$ and $F$ has finite measure. Thus, $V_E \ne \{0\}$ and
$\mu^A(E) \ne 0$. ∎

We say that self-adjoint operators $A_1$ and $A_2$ on Hilbert spaces
$\mathbf{H}_1$ and $\mathbf{H}_2$ are unitarily equivalent if there exists a
unitary map $U : \mathbf{H}_1 \to \mathbf{H}_2$ such that

\[ A_2 = U A_1 U^{-1}. \]

Using Proposition 7.22, we can give a classification of bounded self-adjoint
operators on separable Hilbert spaces up to unitary equivalence. For a given
bounded self-adjoint operator $A$, we call the function
$\lambda \mapsto \dim \mathbf{H}_\lambda$ the multiplicity function for $A$.
It is well defined (independent of the choice of direct integral
decomposition) up to a set of measure zero. It turns out that bounded
self-adjoint operators are characterized, up to unitary equivalence, by the
spectrum of $A$ as a set, the equivalence class of the measure $\mu$ in
Theorem 7.19, and the multiplicity function.


Proposition 7.24 Suppose $A_1$ and $A_2$ are bounded self-adjoint operators
on separable Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$, respectively.
Choose direct integral representations for $A_1$ and $A_2$ as in Theorem 7.19,
with the associated measures $\mu_1$ and $\mu_2$ chosen so that
$\dim \mathbf{H}_\lambda > 0$ for $\mu_j$-almost every $\lambda$
($j = 1, 2$). Then $A_1$ and $A_2$ are unitarily equivalent if and only if the
following conditions are satisfied.


1. $\sigma(A_1) = \sigma(A_2)$.

2. The measures $\mu_1$ and $\mu_2$ are mutually absolutely continuous.

3. The multiplicity functions of $A_1$ and $A_2$ coincide up to a set of
measure zero.


See Exercise 12 for a proof of this result.
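To see that the third condition is not redundant, consider the following small example, chosen here for illustration rather than taken from the text. It assumes the counting measure on the spectrum for both operators, which is one admissible choice in finite dimensions.

```latex
\begin{align*}
A_1 &= \mathrm{diag}(1, 1, 2), \qquad A_2 = \mathrm{diag}(1, 2, 2)
  \quad \text{on } \mathbb{C}^3, \\
\sigma(A_1) &= \sigma(A_2) = \{1, 2\}, \qquad
  \mu_1 = \mu_2 = \text{counting measure on } \{1, 2\}, \\
\dim \mathbf{H}^{(1)}_1 &= 2, \quad \dim \mathbf{H}^{(1)}_2 = 1, \qquad
\dim \mathbf{H}^{(2)}_1 = 1, \quad \dim \mathbf{H}^{(2)}_2 = 2.
\end{align*}
```

Conditions 1 and 2 hold, but the multiplicity functions differ on sets of positive measure, so the proposition says $A_1$ and $A_2$ are not unitarily equivalent. This agrees with the elementary observation that their traces, 4 and 5, differ.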


7.4. Exercises 


1. Suppose $A$ and $B$ are commuting linear operators on a nonzero
finite-dimensional vector space.


(a) Show that each eigenspace for $A$ is invariant under $B$.


(b) Show that $A$ and $B$ have at least one simultaneous eigenvector,
that is, a nonzero vector $v$ with $Av = \lambda v$ and $Bv = \mu v$, for some
constants $\lambda, \mu \in \mathbb{C}$.


2. Suppose that $A \in \mathcal{B}(\mathbf{H})$ is normal, meaning that
$AA^* = A^*A$. Suppose that for some $\psi \in \mathbf{H}$ and
$\lambda \in \mathbb{C}$ we have $A\psi = \lambda\psi$. Show that
$A^*\psi = \bar{\lambda}\psi$.

Hint: Compute $\|(A^* - \bar{\lambda} I)\psi\|^2$.


3. Suppose a closed subspace $V$ of $\mathbf{H}$ is invariant under a bounded
operator $A$, meaning that $A\psi \in V$ for all $\psi \in V$. Show that the
orthogonal complement $V^\perp$ of $V$ is invariant under $A^*$.


4. (a) Suppose that $\mathbf{H}$ is a finite-dimensional Hilbert space over
$\mathbb{C}$ and $A$ is a normal linear operator on $\mathbf{H}$ in the sense
of Exercise 2. Show that there exists an orthonormal basis for $\mathbf{H}$
consisting of simultaneous eigenvectors for $A$ and $A^*$.


Hint: Use Exercises 1 and 3.


(b) Suppose $A$ is a linear operator on a finite-dimensional Hilbert
space $\mathbf{H}$ over $\mathbb{C}$ and suppose there exists an orthonormal
basis for $\mathbf{H}$ consisting of eigenvectors of $A$. Show that $A$
commutes with $A^*$.


5. Suppose $A \in \mathcal{B}(\mathbf{H})$ has an inverse $A^{-1}$ in
$\mathcal{B}(\mathbf{H})$. Show that $(A^{-1})^* A^* = A^* (A^{-1})^* = I$.
Conclude that $A^*$ is invertible and $(A^*)^{-1} = (A^{-1})^*$.


6. Suppose $U$ is a unitary operator on $\mathbf{H}$ (Definition A.55). Show
that the spectrum of $U$ is contained in the unit circle.


Hint: By writing $U - \lambda I$ as $(-\lambda)(I - U/\lambda)$ or as
$U(I - \lambda U^{-1})$, show that any $\lambda$ with $|\lambda| \ne 1$ is in
the resolvent set of $U$.


7. Suppose that $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and
non-negative, that is, that $A$ satisfies (7.3). Show that the spectrum of $A$
is contained in the interval $[0, \infty)$.

Note: Conversely, if $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and
$\sigma(A) \subset [0, \infty)$, then $A$ is non-negative. See Exercise 2 in
Chap. 8.


8. Suppose $A \in \mathcal{B}(\mathbf{H})$ is invertible. Show that there
exists $\varepsilon > 0$ such that for all $B \in \mathcal{B}(\mathbf{H})$ with
$\|B - A\| < \varepsilon$, $B$ is also invertible.


Hint: Use a power series argument as in the proof of Proposition 7.5.


9. Assume $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint.


(a) Suppose $\lambda_0 \in \mathbb{C}$ is a point in the resolvent set of $A$.
Show that

\[ \|(A - \lambda_0 I)^{-1}\| = \frac{1}{d(\lambda_0, \sigma(A))}, \]

where $d(\lambda_0, \sigma(A)) = \inf_{\lambda \in \sigma(A)} |\lambda - \lambda_0|$.


Hint: Think of $(A - \lambda_0 I)^{-1}$ as a function of $A$ in the sense of
the functional calculus for $A$.


(b) Given $\lambda_0 \in \mathbb{C}$, suppose that there exists some nonzero
$\psi \in \mathbf{H}$ such that

\[ \|A\psi - \lambda_0 \psi\| < \varepsilon \|\psi\|. \]

Show that there exists $\lambda \in \sigma(A)$ such that
$|\lambda - \lambda_0| < \varepsilon$.


10. Suppose $V_1$ and $V_2$ are two closed subspaces of $\mathbf{H}$, with
associated orthogonal projections $P_1$ and $P_2$. Show that $V_1$ and $V_2$
are orthogonal if and only if $P_1 P_2 = 0$.


11. Suppose $\mu$ is a projection-valued measure on $(X, \Omega)$. Show that
for any $E_1, E_2 \in \Omega$, $\mu(E_1)\mu(E_2)$ is the projection onto the
closed subspace $\mathrm{Range}(\mu(E_1)) \cap \mathrm{Range}(\mu(E_2))$.


Hint: Write $E_1$ as $E_1 = (E_1 \cap E_2) \cup (E_1 \setminus E_2)$ and use
Exercise 10.


12. Prove Proposition 7.24.


Hint: Use Proposition 7.22 and the Radon-Nikodym theorem
(Theorem A.6).


8 


The Spectral Theorem for Bounded
Self-Adjoint Operators: Proofs 


In this chapter we give proofs of all versions of the spectral theorem stated 
in the previous chapter. 


8.1 Proof of the Spectral Theorem, First Version 


A proof of the spectral theorem, in its projection-valued measure form, can
be obtained in two main stages. The first stage of the proof is to define a
continuous functional calculus, meaning we associate with each continuous
function $f$ on $\sigma(A)$ an operator $f(A)$. The map $f \mapsto f(A)$ should
have the property that if $f$ is the function $f(\lambda) = \lambda^m$, then
$f(A) = A^m$. The continuous functional calculus is then constructed by
approximating continuous functions on $\sigma(A)$ by polynomials. The
Stone-Weierstrass theorem tells us that polynomials are dense in the continuous
functions on $\sigma(A)$; it remains only to show that if a sequence $p_n$ of
polynomials converges uniformly to some continuous function $f$ on
$\sigma(A)$, then the operators $p_n(A)$ converge to some operator, which we
will then call $f(A)$.

The second stage of the proof is to show that the continuous functional 
calculus can be represented as integration against a projection-valued mea- 
sure. This result is just an operator-valued version of the Riesz represen- 
tation theorem from measure theory (Theorem 8.5). Indeed, we will see 
that this operator-valued version of the Riesz representation theorem can 
be reduced to the usual form of the theorem.


8.1.1 Stage 1: The Continuous Functional Calculus

We begin by defining, for any $A \in \mathcal{B}(\mathbf{H})$, the spectral
radius $R(A)$ by

\[ R(A) = \sup_{\lambda \in \sigma(A)} |\lambda|. \]

(By Propositions 7.5 and 7.7, $\sigma(A)$ is a nonempty, bounded subset of
$\mathbb{R}$.) According to Point 2 of Proposition 7.5, we have

\[ R(A) \le \|A\| \]

for any $A \in \mathcal{B}(\mathbf{H})$. In general, $\|A\|$ can be much bigger
than $R(A)$. For example, if $A$ is a nilpotent matrix, then $R(A) = 0$ but
$\|A\|$ can be arbitrarily large.
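The gap between $\|A\|$ and $R(A)$ is easy to see numerically. The following is a minimal sketch for real $2 \times 2$ matrices, assuming the standard facts that the eigenvalues are the roots of the characteristic polynomial and that the operator norm is the square root of the largest eigenvalue of $A^T A$; the function names and sample matrices are illustrative choices.

```python
import cmath
import math

def spectral_radius(M):
    """Largest modulus of an eigenvalue of the 2x2 matrix M."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)   # complex square root handles all cases
    return max(abs((tr + root) / 2), abs((tr - root) / 2))

def operator_norm(M):
    """Operator norm of a real 2x2 matrix: sqrt of the top eigenvalue of M^T M."""
    p = M[0][0] ** 2 + M[1][0] ** 2                 # (M^T M)[0][0]
    q = M[0][0] * M[0][1] + M[1][0] * M[1][1]       # (M^T M)[0][1]
    r = M[0][1] ** 2 + M[1][1] ** 2                 # (M^T M)[1][1]
    largest = (p + r) / 2 + math.hypot((p - r) / 2, q)
    return math.sqrt(largest)

nilpotent = [[0.0, 1000.0], [0.0, 0.0]]   # N*N = 0, so the only eigenvalue is 0
symmetric = [[2.0, 1.0], [1.0, 2.0]]      # self-adjoint, eigenvalues 1 and 3

radius_n, norm_n = spectral_radius(nilpotent), operator_norm(nilpotent)
radius_s, norm_s = spectral_radius(symmetric), operator_norm(symmetric)
```

For the nilpotent matrix the spectral radius is 0 while the norm is 1000; for the symmetric matrix the two quantities agree (both equal 3), as one expects for a self-adjoint operator.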


Lemma 8.1 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the norm and the
spectral radius of $A$ are equal:

\[ \|A\| = R(A). \]


In preparation for the proof, we determine the radius of convergence of
the power series for the resolvent given in the proof of Proposition 7.5.
According to Proposition 7.2, we have

\[ \|A^* A\| = \|A\|^2 \]

for any $A \in \mathcal{B}(\mathbf{H})$. If $A$ is self-adjoint, we obtain

\[ \|A^2\| = \|A\|^2. \]

Iterating this relation gives

\[ \|A^{2^n}\| = \|A\|^{2^n} \tag{8.1} \]

for all $n$.
Consider, for a bounded self-adjoint operator $A$, the following formal
expression for the resolvent of $A$:

\[ (A - \lambda I)^{-1} = -\sum_{m=0}^{\infty} \frac{A^m}{\lambda^{m+1}}. \tag{8.2} \]

If $|\lambda| > \|A\|$, then the proof of Proposition 7.5 shows that the series
(8.2) converges in the operator norm topology and that the sum of the series is
indeed the inverse of $(A - \lambda I)$. If, on the other hand,
$|\lambda| < \|A\|$, it follows from (8.1) that the norms of the terms in (8.2)
do not tend to zero, and so the series cannot converge in the operator norm
topology. We may say, then, that the series (8.2) has radius of convergence
equal to $\|A\|$.

Proof of Lemma 8.1. We know that $R(A) \le \|A\|$. To show that
$R(A) = \|A\|$, we wish to argue that $(A - \lambda I)^{-1}$ is a holomorphic
operator-valued function of $\lambda$ on the set $|\lambda| > R(A)$, and
therefore the Laurent series of $(A - \lambda I)^{-1}$ must converge for
$|\lambda| > R(A)$. But the Laurent series of $(A - \lambda I)^{-1}$ is just
the series in (8.2), and we have shown that the series diverges when
$|\lambda| < \|A\|$. This would be a contradiction if $R(A)$ were less than
$\|A\|$.

To flesh out the argument, recall the formula (7.8) in the proof of
Proposition 7.5 for the resolvent of $A$.

That formula expresses the map $\lambda \mapsto (A - \lambda I)^{-1}$ as a
convergent power series in powers of $\lambda - \lambda_0$, near any point
$\lambda_0$ in the resolvent set of $A$. It follows that for any bounded linear
functional $\xi \in \mathcal{B}(\mathbf{H})^*$, the complex-valued function

\[ \lambda \mapsto \xi\bigl( (A - \lambda I)^{-1} \bigr) \]

is holomorphic on the resolvent set of $A$. This function has a unique Laurent
series, which is given by applying $\xi$ term by term to (8.2). The series will
converge on the largest annulus contained in the resolvent set of $A$, namely
the set of $\lambda$ with $|\lambda| > R(A)$.

Convergence of (8.2) means that $|\xi(A^m / \lambda^{m+1})|$ is bounded as a
function of $m$, for each $\xi$ and each $\lambda$ with $|\lambda| > R(A)$.
Thus, by (a corollary of) the uniform boundedness principle (Appendix A.3.4),
the set $\{A^m / \lambda^{m+1}\}_{m=0}^{\infty}$ is bounded in the Banach space
$\mathcal{B}(\mathbf{H})$, for all $\lambda$ with $|\lambda| > R(A)$. In
particular, for each $\lambda$ with $|\lambda| > R(A)$, there is a constant $C$
such that

\[ \frac{\|A^{2^n}\|}{|\lambda|^{2^n}} = \frac{\|A\|^{2^n}}{|\lambda|^{2^n}} \le C. \]

If $\|A\|$ were greater than $R(A)$, this inequality would be false for
$\lambda$ satisfying $R(A) < |\lambda| < \|A\|$. ∎

The next key step in Stage 1 of the proof is to understand how the 
spectrum of a self-adjoint operator transforms under application of a poly- 
nomial. 


Lemma 8.2 (Spectral Mapping Theorem) For all $A \in \mathcal{B}(\mathbf{H})$
and all polynomials $p$, we have

\[ \sigma(p(A)) = p(\sigma(A)). \]

That is to say, the spectrum of $p(A)$ consists precisely of the numbers of
the form $p(\lambda)$, with $\lambda$ in the spectrum of $A$.
Proof. The result is trivial if $p$ is constant. When $\deg p \ge 1$, let $p$,
given by

\[ p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0, \]

be an arbitrary polynomial. We first show that
$p(\sigma(A)) \subset \sigma(p(A))$. Suppose, then, that
$\lambda \in \sigma(A)$. Observe that

\[ p(A) - p(\lambda) I = a_n (A^n - \lambda^n I) + a_{n-1} (A^{n-1} - \lambda^{n-1} I) + \cdots + a_0 I - a_0 I. \]

Now,

\[ A^k - \lambda^k I = (A - \lambda I)(A^{k-1} + \lambda A^{k-2} + \lambda^2 A^{k-3} + \cdots + \lambda^{k-1} I). \]

Thus, we can pull out a factor of $(A - \lambda I)$ from each nonzero term in
$p(A) - p(\lambda) I$, giving

\[ p(A) - p(\lambda) I = (A - \lambda I) q(A), \]

where $q$ is a polynomial (depending on $\lambda$). Since, by assumption,
$A - \lambda I$ is not invertible, and since $(A - \lambda I)$ commutes with
$q(A)$, $(A - \lambda I) q(A)$ cannot be invertible (Exercise 1). This shows
that $p(\lambda)$ belongs to the spectrum of $p(A)$.

We now show that $\sigma(p(A)) \subset p(\sigma(A))$. Suppose, then, that
$\gamma \in \sigma(p(A))$. Since $\mathbb{C}$ is algebraically closed, we can
factor the polynomial $p(z) - \gamma$, as a function of $z$, as

\[ p(z) - \gamma = c (z - b_1)(z - b_2) \cdots (z - b_n). \tag{8.3} \]

Thus,

\[ p(A) - \gamma I = c (A - b_1 I)(A - b_2 I) \cdots (A - b_n I). \]

Since $p(A) - \gamma I$ is assumed to be noninvertible, there must be some $j$
such that $(A - b_j I)$ is noninvertible, that is, for which
$b_j \in \sigma(A)$. Then (8.3) tells us that $p(b_j) - \gamma = 0$, meaning
that $\gamma = p(b_j)$. Thus, $\gamma$ is of the form $p(\lambda)$ for some
$\lambda$ $(= b_j)$ in $\sigma(A)$. ∎
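The statement of the lemma can be spot-checked on a small matrix. The following is a minimal sketch, assuming a real symmetric $2 \times 2$ matrix (so that $p(A)$ is again symmetric and its eigenvalues are real); the helpers `poly_of_matrix` and `poly_of_number` and the polynomial $p(z) = z^3 - 2z + 1$ are illustrative choices, not objects from the text.

```python
import math

def eigenvalues_symmetric(M):
    """Eigenvalues of a real symmetric 2x2 matrix, in increasing order."""
    mean = (M[0][0] + M[1][1]) / 2.0
    radius = math.hypot((M[0][0] - M[1][1]) / 2.0, M[0][1])
    return [mean - radius, mean + radius]

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def poly_of_matrix(coeffs, A):
    """Evaluate p(A) by Horner's rule; coeffs run from highest degree down."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for coeff in coeffs:
        result = matmul(result, A)
        result[0][0] += coeff       # add coeff * I
        result[1][1] += coeff
    return result

def poly_of_number(coeffs, x):
    """Evaluate p(x) by Horner's rule with the same coefficient order."""
    value = 0.0
    for coeff in coeffs:
        value = value * x + coeff
    return value

A = [[2.0, 1.0], [1.0, 2.0]]        # spectrum {1, 3}
coeffs = [1.0, 0.0, -2.0, 1.0]      # p(z) = z^3 - 2z + 1

spectrum_of_pA = sorted(eigenvalues_symmetric(poly_of_matrix(coeffs, A)))
p_of_spectrum = sorted(poly_of_number(coeffs, lam)
                       for lam in eigenvalues_symmetric(A))
```

The two lists `spectrum_of_pA` and `p_of_spectrum` should agree up to rounding, which is the content of $\sigma(p(A)) = p(\sigma(A))$ in this small case.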

The last step in Stage 1 of our proof is to apply the Stone-Weierstrass
theorem to show that polynomials are dense in $C(\sigma(A); \mathbb{R})$ (the
space of continuous, real-valued functions on $\sigma(A)$) with respect to the
supremum norm.


Proposition 8.3 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. Then
there exists a unique bounded linear map from $C(\sigma(A); \mathbb{R})$ into
$\mathcal{B}(\mathbf{H})$, denoted by $f \mapsto f(A)$, such that when
$f(\lambda) = \lambda^m$, we have $f(A) = A^m$. The map $f \mapsto f(A)$,
$f \in C(\sigma(A); \mathbb{R})$, is called the (real-valued) functional
calculus for $A$.


Proof. Note that if $A$ is self-adjoint, then $p(A)$ is self-adjoint provided
that $p$ is a real-valued polynomial (i.e., one where all the coefficients are
real numbers). Thus, combining the spectral mapping theorem with the
equality of the norm and spectral radius, we have the following: If $A$ is a
self-adjoint operator and $p$ is a real-valued polynomial, then

\[ \|p(A)\| = \sup_{\lambda \in \sigma(A)} |p(\lambda)|. \tag{8.4} \]

Thus, the map $p \mapsto p(A)$ is an isometric linear map from the space of
polynomials on $\sigma(A)$ (with the supremum norm) into the space of bounded
operators on $\mathbf{H}$.

According to the Stone-Weierstrass theorem, polynomials are dense in
$C(\sigma(A);\mathbb{R})$. Thus, by the BLT theorem (Theorem A.36), we can extend the
map $p \mapsto p(A)$ uniquely to a bounded linear map of $C(\sigma(A);\mathbb{R})$ into
$\mathcal{B}(\mathbf{H})$. ∎
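
To see (8.4) at work in a case that can be computed by hand, consider the following small example. The matrix and polynomial are chosen here purely for illustration and do not appear in the text.

```latex
% Illustrative example: H = C^2 and A = diag(-1, 2), so sigma(A) = {-1, 2}.
\[
A = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
p(\lambda) = \lambda^2 - 3, \qquad
p(A) = \begin{pmatrix} -2 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% The operator norm of a diagonal matrix is the largest |diagonal entry|:
\[
\|p(A)\| = \max\{\,|-2|,\ |1|\,\} = 2
         = \sup_{\lambda \in \sigma(A)} |p(\lambda)| .
\]
% By contrast, the supremum of |p| over the whole interval [-1, 2] is
% |p(0)| = 3, which is strictly larger than ||p(A)||.
```

The comparison with the interval $[-1, 2]$ shows why the supremum norm in (8.4) has to be taken over the spectrum itself and not over a larger set containing it.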


Proposition 8.4 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, the (real-valued) continuous
functional calculus for $A$, mapping $C(\sigma(A);\mathbb{R})$ into $\mathcal{B}(\mathbf{H})$, has the
following properties.


1. Multiplicativity: For all $f, g$, we have
\[ (fg)(A) = f(A)g(A), \]
where $fg$ denotes the pointwise product of $f$ and $g$.

2. Self-adjointness: For all $f$, the operator $f(A)$ is self-adjoint.

3. Non-negativity: For all $f$, if $f$ is non-negative, then $f(A)$ is a
non-negative operator.

4. Norm and spectrum properties: For all $f$, we have
\[ \|f(A)\| = \sup_{\lambda \in \sigma(A)} |f(\lambda)| \tag{8.5} \]
and
\[ \sigma(f(A)) = \{ f(\lambda) \mid \lambda \in \sigma(A) \}. \tag{8.6} \]


Proof. Point 1 holds for polynomials and thus, by taking limits, for all
$f \in C(\sigma(A);\mathbb{R})$. Furthermore, if $p$ is a real-valued polynomial and $A$ is
self-adjoint, then $p(A)$ is self-adjoint. From this, we get Point 2 by taking
limits. If $f \in C(\sigma(A);\mathbb{R})$ is non-negative, then $f = g^2$, where $g = \sqrt{f}$ is
real-valued. Thus, $g(A)$ is self-adjoint and for all $\psi \in \mathbf{H}$, Point 1 tells us
that

\[ \langle \psi, f(A)\psi \rangle = \langle \psi, g(A)^2 \psi \rangle = \langle g(A)\psi, g(A)\psi \rangle \geq 0, \tag{8.7} \]

which establishes Point 3. We have already established (8.5) in (8.4) for
polynomials; the result for general $f \in C(\sigma(A);\mathbb{R})$ follows by taking limits.

To establish (8.6), suppose first that $\lambda_0 \in \mathbb{C}$ is not in the range of $f$.
Then the function $g(\lambda) := 1/(f(\lambda) - \lambda_0)$ is continuous on $\sigma(A)$ and the
operator $g(A)$ will be the inverse of $f(A) - \lambda_0 I$, showing that $\lambda_0$ is not in
the spectrum of $f(A)$.

In the other direction, suppose that $\lambda_0 = f(\mu)$ for some $\mu \in \sigma(A)$; we
want to show that $f(\mu) \in \sigma(f(A))$. Suppose, toward a contradiction, that
$f(A) - f(\mu)I$ were invertible, and choose a sequence $p_n$ of polynomials
converging uniformly to $f$ on $\sigma(A)$. By Exercise 8 in Chap. 7, any operator
sufficiently close to $f(A) - f(\mu)I$ in the operator norm topology would also
be invertible. In particular, $p_n(A) - p_n(\mu)I$ would have to be invertible for
all sufficiently large $n$, contradicting the spectral mapping theorem. ∎
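
In finite dimensions the continuous functional calculus can be written down explicitly, which makes the properties in Proposition 8.4 easy to verify directly. The following is a sketch under the assumption that $\mathbf{H}$ is finite-dimensional; the notation $P_j$ for the orthogonal projection onto the eigenspace of the eigenvalue $\lambda_j$ is introduced here for illustration.

```latex
% Assumption: dim H < infinity, with distinct eigenvalues \lambda_j of A and
% orthogonal eigenprojections P_j.
\[
A = \sum_j \lambda_j P_j, \qquad P_j P_k = \delta_{jk} P_j, \qquad \sum_j P_j = I
\quad\Longrightarrow\quad
p(A) = \sum_j p(\lambda_j) P_j \ \text{ for every polynomial } p .
\]
% Passing to uniform limits of polynomials on the finite set sigma(A):
\[
f(A) = \sum_j f(\lambda_j) P_j .
\]
% Multiplicativity (Point 1):
\[
f(A)\,g(A) = \sum_{j,k} f(\lambda_j)\, g(\lambda_k)\, P_j P_k
           = \sum_j (fg)(\lambda_j)\, P_j = (fg)(A).
\]
% Norm and spectrum (Point 4):
\[
\|f(A)\| = \max_j |f(\lambda_j)|, \qquad
\sigma(f(A)) = \{\, f(\lambda_j) \,\}.
\]
```

Self-adjointness and non-negativity (Points 2 and 3) can be read off in the same way, since each $P_j$ is a self-adjoint, non-negative operator and the coefficients $f(\lambda_j)$ are real, respectively non-negative.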


8.1.2 Stage 2: An Operator-Valued Riesz Representation
Theorem


We turn now to Stage 2 of the proof of the spectral theorem. We will make 
use of the Riesz representation theorem from measure theory (not the result 
about continuous linear functionals on a Hilbert space). The following form 
of this result is sufficient for our purposes. 


Theorem 8.5 (Riesz Representation Theorem) Let $X$ be a compact
metric space and let $C(X;\mathbb{R})$ denote the space of continuous, real-valued
functions on $X$. Suppose $\Lambda : C(X;\mathbb{R}) \to \mathbb{R}$ is a linear functional with the
property that $\Lambda(f)$ is non-negative whenever all the values of $f$ are
non-negative. Then there exists a unique (real-valued, positive) measure $\mu$ on
the Borel $\sigma$-algebra in $X$ for which

\[ \Lambda(f) = \int_X f \, d\mu \]

for all $f \in C(X;\mathbb{R})$.


See pp. 353-354 of Volume I of [34] for a short proof in the case in which
$X$ is a compact subset of $\mathbb{R}$, which is all we really require. For the full result
stated above, see Theorems 7.2 and 7.8 in [12]. Observe that $\mu$ is a finite
measure, with $\mu(X) = \Lambda(\mathbf{1})$, where $\mathbf{1}$ is the constant function.

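Given a bounded self-adjoint operator $A \in \mathcal{B}(\mathbf{H})$, we have constructed,
in the previous subsection, a continuous functional calculus for $A$. This
calculus is a map, denoted $f \mapsto f(A)$, from $C(\sigma(A);\mathbb{R})$ into $\mathcal{B}(\mathbf{H})$. If
$f \in C(\sigma(A);\mathbb{R})$ is non-negative, then (Point 3 of Proposition 8.4) $f(A)$ is a
non-negative operator. Thus, given $\psi \in \mathbf{H}$, if we define a linear functional
$\Lambda_\psi$ on $C(\sigma(A);\mathbb{R})$ by the formula

\[ \Lambda_\psi(f) = \langle \psi, f(A)\psi \rangle, \]

$\Lambda_\psi$ will satisfy the hypotheses of the Riesz representation theorem. Thus,
for each $\psi \in \mathbf{H}$, we obtain a unique measure $\mu_\psi$ such that

\[ \langle \psi, f(A)\psi \rangle = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda) \tag{8.8} \]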


for all $f \in C(\sigma(A);\mathbb{R})$. Note that

\[ \mu_\psi(\sigma(A)) = \Lambda_\psi(\mathbf{1}) = \|\psi\|^2. \tag{8.9} \]


Definition 8.6 If $f$ is a bounded measurable (complex-valued) function on
$\sigma(A)$, define a map $Q_f : \mathbf{H} \to \mathbb{C}$ by the formula

\[ Q_f(\psi) = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda), \]

where $\mu_\psi$ is the measure in (8.8).


If $f$ happens to be real valued and continuous, then $Q_f(\psi)$ is equal to
$\langle \psi, f(A)\psi \rangle$, in which case $Q_f$ is a bounded quadratic form. (See
Definition A.60 and Example A.62.) It turns out that $Q_f$ is a bounded quadratic
form for any bounded measurable $f$, in which case Proposition A.63 allows
us to associate with $Q_f$ a bounded operator, which we denote by $f(A)$.
Once the relevant properties of $f(A)$ are established, we will construct the
desired projection-valued measure by setting $\mu^A(E) = 1_E(A)$.
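
It may help to see what the measures $\mu_\psi$ and the forms $Q_f$ look like when $\mathbf{H}$ is finite-dimensional. The following worked example is an illustration only; the notation $P_j$ (orthogonal projection onto the eigenspace of the eigenvalue $\lambda_j$) and $\delta_{\lambda_j}$ (unit point mass at $\lambda_j$) is introduced here and is not used in the text.

```latex
% Assumption: dim H < infinity and A = \sum_j \lambda_j P_j with distinct \lambda_j.
\[
\langle \psi, f(A)\psi \rangle
  = \sum_j f(\lambda_j)\, \langle \psi, P_j \psi \rangle
  = \sum_j f(\lambda_j)\, \|P_j \psi\|^2
\quad\Longrightarrow\quad
\mu_\psi = \sum_j \|P_j \psi\|^2\, \delta_{\lambda_j}.
\]
% Total mass, in agreement with (8.9):
\[
\mu_\psi(\sigma(A)) = \sum_j \|P_j \psi\|^2 = \|\psi\|^2 .
\]
% For an indicator function f = 1_E, the quadratic form of Definition 8.6 is
\[
Q_{1_E}(\psi) = \sum_{\lambda_j \in E} \|P_j \psi\|^2
  = \Big\langle \psi,\ \Big( \sum_{\lambda_j \in E} P_j \Big) \psi \Big\rangle ,
\]
% so the operator associated with Q_{1_E} is the orthogonal projection onto the
% span of the eigenspaces with eigenvalues in E.
```

In this setting the operator $1_E(A)$ is visibly a projection, which is the finite-dimensional shadow of the projection-valued measure constructed below.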


Proposition 8.7 For any bounded measurable function $f$ on $\sigma(A)$, the
map $Q_f$ in Definition 8.6 is a bounded quadratic form.


Proof. Let $\mathcal{F}$ denote the space of all bounded, Borel-measurable functions
$f$ for which $Q_f$ is a quadratic form. Then $\mathcal{F}$ is a vector space and
contains $C(\sigma(A);\mathbb{R})$. Furthermore, $\mathcal{F}$ is closed under uniformly bounded
pointwise limits, because $Q_f(\psi)$ is continuous with respect to such limits,
by dominated convergence. Standard measure-theoretic techniques
(Exercise 3) then show that $\mathcal{F}$ is the space of all bounded Borel-measurable
functions on $X = \sigma(A)$.

Meanwhile, it follows from (8.9) that

\[ |Q_f(\psi)| \leq \sup_{\lambda \in \sigma(A)} |f(\lambda)| \, \|\psi\|^2, \]

showing that $Q_f$ is always a bounded quadratic form. ∎


Definition 8.8 For a bounded measurable function $f$ on $\sigma(A)$, let $f(A)$ be
the operator associated to the quadratic form $Q_f$ by Proposition A.63. This
means that $f(A)$ is the unique operator such that

\[ \langle \psi, f(A)\psi \rangle = Q_f(\psi) = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda) \]

for all $\psi \in \mathbf{H}$.


Observe that if $f$ is real valued, then $Q_f(\psi)$ is real for all $\psi \in \mathbf{H}$, which
means (Proposition A.63) that the associated operator $f(A)$ is self-adjoint.
We will shortly associate with $A$ a projection-valued measure $\mu^A$, and we
will show that $f(A)$, as given by Definition 8.8, agrees with $f(A)$ as given
by $\int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda)$. [See (8.10) and compare Definition 7.13.]


Proposition 8.9 For any two bounded measurable functions $f$ and $g$, we
have

\[ (fg)(A) = f(A)g(A). \]


Proof. Let $\mathcal{F}_1$ denote the space of bounded measurable functions $f$ such
that $(fg)(A) = f(A)g(A)$ for all $g \in C(\sigma(A);\mathbb{R})$. Then $\mathcal{F}_1$ is a vector space
and contains $C(\sigma(A);\mathbb{R})$. We have already noted that dominated convergence
guarantees that the map $f \mapsto Q_f(\psi)$, $\psi \in \mathbf{H}$, is continuous under
uniformly bounded pointwise convergence. By the polarization identity
(Proposition A.59), the same is true for the map $f \mapsto L_f(\phi, \psi)$, where $L_f$ is
the sesquilinear form associated to $Q_f$. Now, by the polarization identity, $f$
will be in $\mathcal{F}_1$ provided that

\[ \langle \psi, (fg)(A)\psi \rangle = \langle \psi, f(A)g(A)\psi \rangle \]

or, equivalently,

\[ Q_{fg}(\psi) = L_f(\psi, g(A)\psi) \]

for all $\psi \in \mathbf{H}$ and all $g \in C(\sigma(A);\mathbb{R})$. From this, we can see that $\mathcal{F}_1$ is
closed under uniformly bounded pointwise limits. Thus, by Exercise 3, $\mathcal{F}_1$
consists of all bounded, Borel-measurable functions.

We now let $\mathcal{F}_2$ denote the space of all bounded, Borel-measurable
functions $f$ such that $(fg)(A) = f(A)g(A)$ for all bounded Borel-measurable
functions $g$. Our result for $\mathcal{F}_1$ shows that $\mathcal{F}_2$ contains $C(\sigma(A);\mathbb{R})$. Thus,
the same argument as for $\mathcal{F}_1$ shows that $\mathcal{F}_2$ consists of all bounded,
Borel-measurable functions.


Theorem 8.10 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint. For any measurable set
$E \subset \sigma(A)$, define an operator $\mu^A(E)$ by

\[ \mu^A(E) = 1_E(A), \]

where $1_E(A)$ is given by Definition 8.8. Then $\mu^A$ is a projection-valued
measure on $\sigma(A)$ and satisfies

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \]


Theorem 8.10 establishes the existence of the projection-valued measure
in our first version of the spectral theorem (Theorem 7.12).
Proof. Since $1_E$ is real-valued and satisfies $1_E \cdot 1_E = 1_E$, Proposition 8.9,
together with the observation following Definition 8.8, tells us that $1_E(A)$
is self-adjoint and satisfies $1_E(A)^2 = 1_E(A)$. Thus, $\mu^A(E)$ is an orthogonal
projection (Proposition A.57), for any measurable set $E \subset \sigma(A)$. If $E_1$ and
$E_2$ are measurable sets, then $1_{E_1 \cap E_2} = 1_{E_1} \cdot 1_{E_2}$ and so

\[ \mu^A(E_1 \cap E_2) = \mu^A(E_1)\mu^A(E_2). \]


If $E_1, E_2, \ldots$ are disjoint measurable sets, then
$\mu^A(E_j)\mu^A(E_k) = \mu^A(\emptyset) = 0$ for $j \neq k$, and so the ranges of the
projections $\mu^A(E_j)$ and $\mu^A(E_k)$ are orthogonal. It then follows by an
elementary argument that, for all $\psi \in \mathbf{H}$, we have

\[ \sum_{j=1}^{\infty} \mu^A(E_j)\psi = P\psi, \]

where the sum converges in the norm topology of $\mathbf{H}$ and where $P$ is the
orthogonal projection onto the smallest closed subspace containing the
range of $\mu^A(E_j)$ for every $j$. On the other hand, if $E := \bigcup_{j=1}^{\infty} E_j$, then
the sequence $f_N := \sum_{j=1}^{N} 1_{E_j}$ is uniformly bounded (by 1) and converges
pointwise to $1_E$. Thus, using again dominated convergence in (8.8),

\[ \lim_{N \to \infty} \sum_{j=1}^{N} \langle \psi, 1_{E_j}(A)\psi \rangle = \langle \psi, 1_E(A)\psi \rangle. \]

It follows that $1_E(A)$ coincides with $P$, which establishes the desired
countable additivity for $\mu^A$.
Finally, if $f = 1_E$ for some Borel set $E$, then

\[ \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda) = f(A), \tag{8.10} \]

where $f(A)$ is given by Definition 8.8. [The integral is equal to $\mu^A(E)$, which
is, by definition, equal to $1_E(A)$.] The equality (8.10) then holds for simple
functions by linearity and for all bounded, Borel-measurable functions by
taking limits. In particular, if $f(\lambda) = \lambda$, then the integral of $f$ against $\mu^A$
agrees with $f(A)$ as defined in Definition 8.8, which agrees with $f(A)$ as
defined in the continuous functional calculus, which in turn agrees with
$f(A)$ as defined for polynomials, namely $f(A) = A$. This means that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A, \]

as desired. ∎
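
For a concrete picture of Theorem 8.10, the sketch below builds the two spectral projections of a symmetric $2 \times 2$ matrix and checks the projection-valued-measure properties numerically. It is only an illustration under stated assumptions: the spectrum has exactly two distinct points, so each indicator function $1_{\{\lambda_j\}}$ agrees on $\sigma(A)$ with a degree-one (Lagrange) polynomial. The matrix `A` and the helper names are hypothetical and not taken from the text.

```python
import math

def eig_sym2(M):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def mat_mul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def indicator_of_point(M, keep, drop):
    """(M - drop*I)/(keep - drop): the polynomial equal to 1 at `keep` and 0 at `drop`, applied to M."""
    scale = 1.0 / (keep - drop)
    return [[(M[i][j] - (drop if i == j else 0.0)) * scale for j in range(2)]
            for i in range(2)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

A = [[2.0, 1.0], [1.0, 2.0]]             # hypothetical example, spectrum {1, 3}
lam1, lam2 = eig_sym2(A)
P1 = indicator_of_point(A, lam1, lam2)   # mu^A({lam1})
P2 = indicator_of_point(A, lam2, lam1)   # mu^A({lam2})

identity = [[1.0, 0.0], [0.0, 1.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
total = [[P1[i][j] + P2[i][j] for j in range(2)] for i in range(2)]
integral = [[lam1 * P1[i][j] + lam2 * P2[i][j] for j in range(2)] for i in range(2)]

checks = {
    "P1 is idempotent": close(mat_mul(P1, P1), P1),
    "P2 is idempotent": close(mat_mul(P2, P2), P2),
    "disjoint sets give orthogonal ranges": close(mat_mul(P1, P2), zero),
    "mu^A(sigma(A)) = I": close(total, identity),
    "integral of lambda d mu^A = A": close(integral, A),
}
print(checks)
```

Here the "integral" of $\lambda$ against $\mu^A$ reduces to the finite sum $\lambda_1 P_1 + \lambda_2 P_2$, so the last check is the finite-dimensional form of $\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A$.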

We have now established the existence of the projection-valued measure
$\mu^A$ in Theorem 7.12. The uniqueness of $\mu^A$ is left as an exercise (Exercise 4).
We close this section by proving Proposition 7.16, which states that if a
bounded operator $B$ commutes with a bounded self-adjoint operator $A$,
then $B$ commutes with $f(A)$, for all bounded, Borel-measurable functions
$f$ on $\sigma(A)$.
Proof of Proposition 7.16. If $B$ commutes with $A$, then $B$ commutes
with $p(A)$, for any polynomial $p$. Thus, by taking limits as in the
construction of the continuous functional calculus, $B$ will commute with $f(A)$ for
any continuous real-valued function $f$ on $\sigma(A)$. We now let $\mathcal{F}$ denote the
space of all bounded, Borel-measurable functions $f$ on $\sigma(A)$ for which $f(A)$
commutes with $B$, so that $\mathcal{F}$ contains $C(\sigma(A);\mathbb{R})$.

To show that a bounded measurable $f$ belongs to $\mathcal{F}$, it suffices to show
that for all $\phi, \psi \in \mathbf{H}$ we have $\langle \phi, f(A)B\psi \rangle = \langle \phi, Bf(A)\psi \rangle$, or, equivalently,
$\langle \phi, f(A)B\psi \rangle = \langle B^*\phi, f(A)\psi \rangle$. That is, we want

\[ L_f(\phi, B\psi) = L_f(B^*\phi, \psi). \]

But we have seen that for fixed vectors $\psi_1, \psi_2 \in \mathbf{H}$, the map $f \mapsto L_f(\psi_1, \psi_2)$
is continuous under uniformly bounded pointwise limits. Thus, $\mathcal{F}$ is closed
under such limits, which implies (Exercise 3) that $\mathcal{F}$ contains all bounded,
Borel-measurable functions. ∎


8.2 Proof of the Spectral Theorem, Second Version 


We now turn to the proof of Theorem 7.19. As in the proof of Theorem 7.12,
we will make use of the continuous functional calculus for a bounded self-adjoint
operator $A$ and the Riesz representation theorem. We begin by establishing
the special case in which $A$ has a cyclic vector, that is, a vector $\psi$ with
the property that the vectors $A^k\psi$, $k = 0, 1, 2, \ldots$, span a dense subspace
of $\mathbf{H}$. In that case, the direct integral will be simply an $L^2$ space (i.e., the
Hilbert spaces $\mathbf{H}_\lambda$ are equal to $\mathbb{C}$ for all $\lambda$). Thus, in this special case, the
direct integral and multiplication operator versions of the spectral theorem
coincide.


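Lemma 8.11 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi$ is a cyclic vector
for $A$. Let $\mu_\psi$ be the unique measure on $\sigma(A)$, given by Theorem 8.5, for
which

\[ \langle \psi, f(A)\psi \rangle = \int_{\sigma(A)} f(\lambda) \, d\mu_\psi(\lambda) \tag{8.11} \]

for all $f \in C(\sigma(A);\mathbb{R})$. Then there exists a unitary map

\[ U : \mathbf{H} \to L^2(\sigma(A), \mu_\psi) \]

such that

\[ [UAU^{-1}\phi](\lambda) = \lambda\phi(\lambda) \]

for all $\phi \in L^2(\sigma(A), \mu_\psi)$.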


Proof. We start by defining $U$ on the complex vector space of vectors of
the form $p(A)\psi$, where $p$ is a complex-valued polynomial, as follows:

\[ U[p(A)\psi] = p. \]

To show that $U$ is well defined, write $p$ as $p = p_1 + ip_2$, where $p_1$ and $p_2$
are real-valued polynomials. Since $p_1(A)$ and $p_2(A)$ are self-adjoint and
commuting, we obtain

\[ \langle p(A)\psi, p(A)\psi \rangle = \langle \psi, [p_1(A)^2 + p_2(A)^2]\psi \rangle
   = \int_{\sigma(A)} [p_1(\lambda)^2 + p_2(\lambda)^2] \, d\mu_\psi(\lambda), \tag{8.12} \]

by canceling cross terms and applying (8.11). Thus, if $p(A)\psi = 0$ in $\mathbf{H}$,
then $p(\lambda) = 0$ for $\mu_\psi$-almost every $\lambda$ in $\sigma(A)$, so that $p$ represents the zero
element of $L^2(\sigma(A), \mu_\psi)$.

Equation (8.12) shows also that the map $U$ is isometric on its initial
domain. This initial domain is dense in $\mathbf{H}$ since it contains the vectors
$A^k\psi$ and $\psi$ is cyclic. Thus, the BLT theorem (Theorem A.36) tells us that
$U$ extends uniquely to an isometric map of $\mathbf{H}$ into $L^2(\sigma(A), \mu_\psi)$. Since
polynomials are dense in $L^2(\sigma(A), \mu_\psi)$ (by the Stone-Weierstrass theorem
and Theorem A.10), $U$ actually is unitary.

Now, since $U$ takes $A^k\psi$ to the function $\lambda \mapsto \lambda^k$ in $L^2(\sigma(A), \mu_\psi)$, we
have that $UAU^{-1}(\lambda^k) = \lambda^{k+1}$. Thus,

\[ [UAU^{-1}p](\lambda) = \lambda p(\lambda) \]

for all polynomials $p$. Since polynomials are dense in $L^2(\sigma(A), \mu_\psi)$, we have
$[UAU^{-1}\phi](\lambda) = \lambda\phi(\lambda)$ for all $\phi \in L^2(\sigma(A), \mu_\psi)$, as claimed. ∎
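
The construction in Lemma 8.11 can be followed by hand in two dimensions. The example below is an illustration chosen here (it is not from the text) and assumes a diagonal matrix with two distinct eigenvalues.

```latex
% Illustrative example: H = C^2, A = diag(\lambda_1, \lambda_2), \lambda_1 \neq \lambda_2.
\[
A = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}, \qquad
\psi = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
A\psi = \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix}, \qquad
\det\begin{pmatrix} 1 & \lambda_1 \\ 1 & \lambda_2 \end{pmatrix}
   = \lambda_2 - \lambda_1 \neq 0 .
\]
% Hence \psi and A\psi span C^2, so \psi is cyclic. The measure in (8.11) is
\[
\langle \psi, f(A)\psi \rangle = f(\lambda_1) + f(\lambda_2)
\quad\Longrightarrow\quad
\mu_\psi = \delta_{\lambda_1} + \delta_{\lambda_2},
\]
% and L^2(\sigma(A), \mu_\psi) is identified with C^2 via
% \phi \mapsto (\phi(\lambda_1), \phi(\lambda_2)). Since
\[
p(A)\psi = \begin{pmatrix} p(\lambda_1) \\ p(\lambda_2) \end{pmatrix},
\qquad U[p(A)\psi] = p ,
\]
% the map U sends a vector (v_1, v_2) to the function \lambda_j \mapsto v_j, and
\[
[UAU^{-1}\phi](\lambda_j) = \lambda_j\, \phi(\lambda_j), \qquad j = 1, 2 .
\]
```

If instead $\lambda_1 = \lambda_2$, then $A$ is a multiple of the identity on $\mathbb{C}^2$ and the vectors $A^k\psi$ span only a line, so no cyclic vector exists; this is the situation that Lemma 8.13 is designed to handle.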


Lemma 8.12 Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\mu^A$ is the associated
projection-valued measure on $\sigma(A)$, as in Theorem 8.10. Then there exists
a non-negative real-valued measure $\mu$ on $\sigma(A)$ such that for all Borel sets
$E \subset \sigma(A)$, we have $\mu^A(E) = 0$ if and only if $\mu(E) = 0$.


Proof. Let $\{e_j\}$ be an orthonormal basis for $\mathbf{H}$ and let $\mu_{e_j}$ be the associated
real-valued measures, given by $\mu_{e_j}(E) = \langle e_j, \mu^A(E)e_j \rangle$. Then
$\mu_{e_j}(\sigma(A)) = \langle e_j, I e_j \rangle = 1$ for all $j$. Thus, the formula

\[ \mu = \sum_j \frac{1}{j^2} \mu_{e_j} \]

defines a finite measure on $\sigma(A)$. Given some Borel set $E \subset \sigma(A)$, if
$\mu^A(E) = 0$, then $\mu_{e_j}(E) = 0$ for all $j$ and so $\mu(E) = 0$. Conversely, if
$\mu(E) = 0$, then

\[ 0 = \langle e_j, \mu^A(E)e_j \rangle = \langle \mu^A(E)e_j, \mu^A(E)e_j \rangle \]

for all $j$, since $\mu^A(E)$ is self-adjoint and $\mu^A(E)^2 = \mu^A(E)$. Thus,
$\mu^A(E)e_j = 0$ for all $j$, which means that $\mu^A(E) = 0$. ∎


Lemma 8.13 If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, then $\mathbf{H}$ can be decomposed as
an orthogonal direct sum of closed nonzero subspaces $W_j$, where each $W_j$ is
invariant under $A$ and where the restriction of $A$ to $W_j$ has a cyclic vector
$\psi_j$. The number of $W_j$'s is either finite or countably infinite.


Proof. Recall our standing assumption that $\mathbf{H}$ is separable, and let $\{\phi_j\}$
be a countable dense subset of $\mathbf{H}$. Let $W_1$ be the closed subspace of $\mathbf{H}$
spanned by $\phi_1, A\phi_1, A^2\phi_1, \ldots$. Then $W_1$ is invariant under $A$ and $\psi_1 := \phi_1$
is a cyclic vector for $A|_{W_1}$. If $W_1 = \mathbf{H}$ then we are done. If not, let $j$ be
the smallest number such that $\phi_j$ is not contained in $W_1$. Let $\psi_2$ be the
orthogonal projection of $\phi_j$ onto the orthogonal complement of $W_1$, and let
$W_2$ be the closed span of $\psi_2, A\psi_2, A^2\psi_2, \ldots$. Then $W_2$ is invariant under $A$
and $\psi_2$ is a cyclic vector for $A|_{W_2}$. Furthermore, since $A$ is self-adjoint and
leaves $W_1$ invariant, it also leaves $W_1^\perp$ invariant, which means that $A^k\psi_2$
is orthogonal to $W_1$ for all $k$, so that $W_2$ is orthogonal to $W_1$.

If, now, $W_1 \oplus W_2 = \mathbf{H}$, we are done. If not, we let $k$ be the smallest
number such that $\phi_k$ is not in $W_1 \oplus W_2$ and we let $\psi_3$ be the projection
of $\phi_k$ onto the orthogonal complement of $W_1 \oplus W_2$, and so on. Continuing
on in this way, we obtain an orthogonal collection of closed subspaces that
are invariant under $A$, each of which has a cyclic vector. Either the process
terminates with finitely many of these subspaces spanning $\mathbf{H}$, or we get an
infinite family. In the latter case, each $\phi_j$ belongs to the span of the $W_j$'s
and hence the (Hilbert space) direct sum of the $W_j$'s is all of $\mathbf{H}$. ∎
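
The procedure in the proof of Lemma 8.13 can be traced in a small example where no single cyclic vector exists. The matrix and starting vector below are assumptions made for illustration.

```latex
% Illustrative example: H = C^3, A = diag(1, 1, 2); e_1, e_2, e_3 is the standard basis.
\[
A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \qquad
\phi_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \qquad
A\phi_1 = \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}, \qquad
A^2\phi_1 = \begin{pmatrix} 1 \\ 0 \\ 4 \end{pmatrix}.
\]
% Step 1: W_1 is the span of \phi_1, A\phi_1, A^2\phi_1, ...
\[
W_1 = \operatorname{span}\{e_1, e_3\}, \qquad \psi_1 = \phi_1, \qquad
W_1^{\perp} = \operatorname{span}\{e_2\}.
\]
% Step 2: take the first \phi_j not in W_1 (its second component c is nonzero)
% and project it onto W_1^\perp:
\[
\psi_2 = c\, e_2, \qquad A\psi_2 = \psi_2, \qquad
W_2 = \operatorname{span}\{e_2\}, \qquad
\mathbf{H} = W_1 \oplus W_2 .
\]
```

No single vector is cyclic for this $A$: because $A$ has only two distinct eigenvalues, the span of $v, Av, A^2v, \ldots$ has dimension at most two for every $v \in \mathbb{C}^3$. Two summands are therefore genuinely needed.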

We are now ready for the proof of our second form of the spectral theo- 
rem. 

Proof of Theorem 7.19. Let $\{W_j, \psi_j\}$ be as in Lemma 8.13, and let $A_j$
denote the restriction of $A$ to $W_j$, which is a bounded self-adjoint operator
on the Hilbert space $W_j$. For each $A_j$, we can obtain a unitary map $U_j$ as in
Lemma 8.11, and we wish to piece these maps together for different values
of $j$ to obtain a direct integral decomposition for all of $\mathbf{H}$. To facilitate
piecing the maps together, we will modify the $U_j$'s so that they all map to
$L^2$ spaces over a subset of $\sigma(A)$ with respect to the same measure $\mu$.

If we apply Lemma 8.11 to $A_j$, we get a unitary map

\[ U_j : W_j \to L^2(\sigma(A_j), \mu_{\psi_j}) \]

such that $U_j A_j U_j^{-1}$ is the operator of multiplication by $\lambda$. Here, $\mu_{\psi_j}$ is the
measure on $\sigma(A_j)$ given by $\mu_{\psi_j}(E) = \langle \psi_j, \mu^{A_j}(E)\psi_j \rangle$. Now, according to
Exercise 5, the spectrum of $A_j$ is contained in the spectrum of $A$.
Furthermore, if $E$ is a measurable subset of $\sigma(A_j) \subset \sigma(A)$, then $1_E$ may be
thought of as a measurable function either on $\sigma(A_j)$ or on $\sigma(A)$. Exercise 5
tells us that $1_E(A_j)$, as defined by the functional calculus for $A_j$, coincides
with the restriction to $W_j$ of $1_E(A)$. Thus, if $1_E(A) = 0$ then $1_E(A_j) = 0$
as well. Equivalently, if $\mu^A(E) = 0$ then $\mu^{A_j}(E) = 0$, where $\mu^{A_j}$ is the
projection-valued measure associated to the self-adjoint operator $A_j$.

Let us now choose a measure $\mu$ as in Lemma 8.12. Any set of measure
zero for $\mu$ is a set of measure zero for $\mu^A$ and thus also for $\mu^{A_j}$ and then
for $\mu_{\psi_j}$. Thus, if we extend $\mu_{\psi_j}$ to a measure on $\sigma(A)$ by making it zero on
$\sigma(A) \setminus \sigma(A_j)$, we have that $\mu_{\psi_j}$ is absolutely continuous with respect to $\mu$.
By the Radon-Nikodym theorem (Theorem A.6), each $\mu_{\psi_j}$ has a density
$\rho_j$ with respect to $\mu$, and this density is nonzero $\mu_{\psi_j}$-almost everywhere.
Now, the map

\[ f \mapsto \rho_j^{1/2} f \]

is easily seen to be a unitary map of $L^2(\sigma(A_j), \mu_{\psi_j})$ to $L^2(\sigma(A_j), \mu)$. Thus,
we can define a unitary map

\[ \tilde{U}_j : W_j \to L^2(\sigma(A_j), \mu) \]

by setting

\[ (\tilde{U}_j\psi)(\lambda) = \rho_j(\lambda)^{1/2} (U_j\psi)(\lambda). \]

Since multiplication by $(\rho_j)^{1/2}$ commutes with multiplication by $\lambda$, we have

\[ (\tilde{U}_j A_j \tilde{U}_j^{-1}\psi)(\lambda) = \lambda\psi(\lambda). \]
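
The two properties of the map $f \mapsto \rho_j^{1/2} f$ used above, that it preserves norms and that it intertwines multiplication by $\lambda$, follow from a one-line computation each. The sketch below spells them out using only the defining property $d\mu_{\psi_j} = \rho_j \, d\mu$ of the Radon-Nikodym density; it checks the isometry and intertwining properties only and does not address surjectivity.

```latex
% Isometry: uses d\mu_{\psi_j} = \rho_j \, d\mu with \rho_j \geq 0.
\[
\int_{\sigma(A_j)} \big| \rho_j(\lambda)^{1/2} f(\lambda) \big|^2 \, d\mu(\lambda)
  = \int_{\sigma(A_j)} |f(\lambda)|^2 \, \rho_j(\lambda) \, d\mu(\lambda)
  = \int_{\sigma(A_j)} |f(\lambda)|^2 \, d\mu_{\psi_j}(\lambda).
\]
% Intertwining: multiplication operators commute with one another.
\[
\rho_j(\lambda)^{1/2} \big( \lambda f(\lambda) \big)
  = \lambda \big( \rho_j(\lambda)^{1/2} f(\lambda) \big).
\]
```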


Now, $L^2(\sigma(A_j), \mu)$ can be thought of as a direct integral over $\sigma(A)$ with
respect to $\mu$, where we take $\mathbf{H}^j_\lambda = \mathbb{C}$ for $\lambda \in \sigma(A_j)$ and we take
$\mathbf{H}^j_\lambda = \{0\}$ if $\lambda \in \sigma(A_j)^c$. We now define another direct integral over
$\sigma(A)$ in which the Hilbert spaces $\mathbf{H}_\lambda$, $\lambda \in \sigma(A)$, are defined by

\[ \mathbf{H}_\lambda = \bigoplus_j \mathbf{H}^j_\lambda. \]

Here the measurable structure on the direct integral is defined by setting

\[ e_j(\lambda) = \begin{cases} (0, 0, \ldots, 1, 0, 0, \ldots), & \lambda \in E_j, \\ (0, 0, \ldots, 0, 0, 0, \ldots), & \lambda \in E_j^c, \end{cases} \]

where the 1 is in the $j$th slot and where we write $E_j$ for $\sigma(A_j)$. Since each
$\mathbf{H}_\lambda$ is a direct sum of the $\mathbf{H}^j_\lambda$'s, the direct integral of the $\mathbf{H}_\lambda$'s is the
Hilbert space direct sum over $j$ of the direct integrals of the $\mathbf{H}^j_\lambda$'s, the $j$th
of which is just $L^2(\sigma(A_j), \mu)$.

Meanwhile, $\mathbf{H}$ is the direct sum of the $W_j$'s, and we have unitary maps
$\tilde{U}_j$ of $W_j$ to $L^2(\sigma(A_j), \mu)$ such that $\tilde{U}_j A \tilde{U}_j^{-1}$ is just multiplication by $\lambda$ on
$L^2(E_j, \mu)$. Thus, we can assemble the $\tilde{U}_j$'s into a single unitary map $U$ of $\mathbf{H}$
to the direct integral of the $\mathbf{H}_\lambda$'s, and we will have $UAU^{-1}$ equal to
multiplication by $\lambda$, as desired. ∎

In the interest of brevity, we will not give a complete proof of
Proposition 7.22 (uniqueness in Theorem 7.19), but only indicate the main ideas.
To establish the equivalence of $\mu^{(1)}$ and $\mu^{(2)}$, we observe that both
measures have the same sets of measure zero as the projection-valued measure
$\mu^A$ (Proposition 7.23). Meanwhile, if we have two different direct integrals,
each unitarily equivalent to $\mathbf{H}$ as in (7.20), then there will be a unitary
map $V$ between the two direct integrals that commutes with the operator
$s(\lambda) \mapsto \lambda s(\lambda)$. Using an argument similar to that in Exercise 7, we
can show that there must be bounded maps $V_\lambda : \mathbf{H}^{(1)}_\lambda \to \mathbf{H}^{(2)}_\lambda$ such that
$(Vs)(\lambda) = V_\lambda s(\lambda)$ for almost every $\lambda$. Then we argue that the only way
$V$ can be unitary is if $V_\lambda$ is unitary for almost every $\lambda$. This implies that
$\dim \mathbf{H}^{(1)}_\lambda = \dim \mathbf{H}^{(2)}_\lambda$ for almost every $\lambda$.

Finally, we briefly indicate the proof of the multiplication operator form
of the spectral theorem.
Proof of Theorem 7.20. Let $W_j$ be as in Lemma 8.13 and let $A_j$ be the
restriction of $A$ to $W_j$. By the proof of Theorem 7.19, each $A_j$ is unitarily
equivalent to multiplication by $\lambda$ on the Hilbert space $L^2(\sigma(A_j), \mu_j)$, for
some finite measure $\mu_j$ on $\sigma(A_j)$. Let $X$ be the disjoint union of the sets
$\sigma(A_j)$, let $\mu$ be the sum of the measures $\mu_j$, and let $h$ be the function
whose restriction to each $\sigma(A_j)$ is the function $\lambda \mapsto \lambda$. Then $L^2(X, \mu)$ is
the orthogonal direct sum of the Hilbert spaces $L^2(\sigma(A_j), \mu_j)$, which means
that $L^2(X, \mu)$ may be identified unitarily with $\mathbf{H} = \bigoplus_j W_j$ in an obvious way.
Under this identification, the operator $A$ corresponds to multiplication by $h$.
∎


8.3 Exercises


1. (a) Suppose $A, B \in \mathcal{B}(\mathbf{H})$ commute and $A$ is not invertible. Show
that $AB$ is not invertible.


Hint: First show that if AB were invertible, then A would have 
both a left inverse and a right inverse. Then show that the left 
inverse and right inverse would need to be equal. 


(b) Show that the result of Part (a) is false if we omit the assumption 
that A and B commute. 


2. (a) Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\sigma(A) \subset [0, \infty)$. Show that
$A$ has a self-adjoint square root in $\mathcal{B}(\mathbf{H})$ and therefore that $A$ is
a non-negative operator (i.e., $\langle \psi, A\psi \rangle \geq 0$ for all $\psi \in \mathbf{H}$).


(b) Give an example of a bounded operator $A$ on a Hilbert space
such that $\sigma(A) \subset [0, \infty)$ but $A$ is not non-negative.


3. Let $X$ be a compact metric space and let $C(X;\mathbb{R})$ denote the space
of continuous real-valued functions on $X$. Suppose that $\mathcal{F}$ is a set of
bounded, measurable, complex-valued functions on $X$ with the
following properties: (1) $\mathcal{F}$ is a complex vector space, (2) $\mathcal{F}$ contains
$C(X;\mathbb{R})$, and (3) $\mathcal{F}$ is closed under pointwise limits of uniformly
bounded sequences. (A sequence $f_n$ is uniformly bounded if there
exists a constant $C$ such that $|f_n(x)| \leq C$ for all $n$ and $x$.)


(a) Let $\mathcal{L}_0$ denote the collection of those measurable sets $E$ for which
$1_E$ is a uniformly bounded limit of a sequence of continuous
functions. Show that $\mathcal{L}_0$ is an algebra and contains all open sets
in $X$.

(b) Let $\mathcal{L}_1$ denote the collection of all measurable sets $E$ in $X$ for
which $1_E$ belongs to $\mathcal{F}$. Using the monotone class lemma
(Theorem A.8), show that $\mathcal{L}_1$ consists of all Borel sets in $X$.


(c) Show that $\mathcal{F}$ consists of all bounded, Borel-measurable functions
on $X$.


4. Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\mu^A$ and $\nu^A$ are two
projection-valued measures on $\sigma(A)$ such that

\[ \int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = \int_{\sigma(A)} \lambda \, d\nu^A(\lambda) = A. \]

Show that integration with respect to $\mu^A$ agrees with integration with
respect to $\nu^A$, first on polynomials, then on continuous functions, and
finally on bounded measurable functions. Conclude that $\mu^A = \nu^A$.


Hint: Use Exercise 17.


5. Suppose $A \in \mathcal{B}(\mathbf{H})$ is a self-adjoint operator and $V$ is a closed subspace
of $\mathbf{H}$ that is invariant under $A$.


(a) Using Proposition 7.7, show that the spectrum of the restriction
to $V$ of $A$ is contained in the spectrum of $A$.


(b) Suppose now that $f$ is a bounded measurable function on $\sigma(A)$,
which means that $f$ is also a function on $\sigma(A|_V) \subset \sigma(A)$. Show
that $V$ is invariant under $f(A)$ and that

\[ f(A)|_V = f(A|_V), \]

where the operator on the right-hand side is defined by the
measurable functional calculus for the bounded self-adjoint
operator $A|_V$.


6. Suppose $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint and $\psi$ is an eigenvector for $A$, that
is, a nonzero vector with $A\psi = \lambda\psi$ for some $\lambda \in \mathbb{R}$. Show that for
any bounded measurable function $f$ on $\sigma(A)$ we have

\[ f(A)\psi = f(\lambda)\psi. \]

Hint: Use Exercise 5.


7. Suppose $K \subset \mathbb{R}$ is a compact set and $\mu$ is a finite measure on $K$. Let
$A$ be the bounded operator on $L^2(K, \mu)$ given by

\[ (A\psi)(\lambda) = \lambda\psi(\lambda). \]

Now suppose that $B$ is a bounded operator on $L^2(K, \mu)$ that
commutes with $A$.

(a) Let $\phi = B\mathbf{1}$, where $\mathbf{1}$ denotes the constant function, so that
$\phi \in L^2(K, \mu)$. Show that for all continuous functions $\psi$ on $K$,
we have $B\psi = \phi\psi$.

(b) Using Exercise 3, show that for all bounded, Borel-measurable
functions $\psi$ on $K$, we have $B\psi = \phi\psi$.

(c) Show that $\phi$ is essentially bounded (i.e., bounded outside a set of
$\mu$-measure zero). Conclude that $B\psi = \phi\psi$ for all $\psi \in L^2(K, \mu)$.


8. If $A \in \mathcal{B}(\mathbf{H})$ is self-adjoint, define $U(t) \in \mathcal{B}(\mathbf{H})$ by $U(t) = \exp\{itA\}$
for each $t \in \mathbb{R}$, where the exponential is defined by the functional
calculus for $A$.


(a) Show that $U(t)$ is unitary for all $t$ and that $U(s)U(t) = U(s + t)$.
(A family of operators with this property is called a
one-parameter unitary group.)


(b) Show that the map $t \mapsto U(t)$ is continuous in the operator norm
topology.


(c) Give an example of a one-parameter unitary group on a Hilbert 
space that is not continuous in the operator norm topology. 


See Sect. 10.2 for more on one-parameter unitary groups. 


9 
Unbounded Self-Adjoint Operators 


9.1 Introduction 


Recall that most of the operators of quantum mechanics, including those
representing position, momentum, and energy, are not defined on the
entirety of the relevant Hilbert space, but only on a dense subspace thereof.
In the case of the position operator, for example, given $\psi \in L^2(\mathbb{R})$, the
function $X\psi(x) = x\psi(x)$ could easily fail to be in $L^2(\mathbb{R})$. Nevertheless, the
space of $\psi$'s in $L^2(\mathbb{R})$ for which $x\psi(x)$ is again in $L^2(\mathbb{R})$ is a dense subspace
of $L^2(\mathbb{R})$. A closely related property of these operators is that they are not
bounded, meaning that there is no constant $C$ such that

\[ \|A\psi\| \leq C\|\psi\| \]

for all $\psi$ for which $A\psi$ is defined. Because our operators are unbounded, we
cannot use the BLT (bounded linear transformation) theorem to extend
them to the whole Hilbert space.
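
Both phenomena just described can be checked directly for the position operator. The specific functions below are chosen here for illustration and are not taken from the text.

```latex
% (i) A function in L^2(R) whose image under X is not in L^2(R):
\[
\psi(x) = \frac{1}{\sqrt{1 + x^2}}, \qquad
\int_{\mathbb{R}} |\psi(x)|^2 \, dx = \pi < \infty, \qquad
\int_{\mathbb{R}} |x\psi(x)|^2 \, dx = \int_{\mathbb{R}} \frac{x^2}{1 + x^2} \, dx = \infty .
\]
% (ii) Unboundedness on the domain: take indicator functions of unit intervals.
\[
\psi_n = 1_{[n,\, n+1]}, \qquad \|\psi_n\|^2 = 1, \qquad
\|X\psi_n\|^2 = \int_n^{n+1} x^2 \, dx = n^2 + n + \tfrac{1}{3} \geq n^2 ,
\]
% so \|X\psi_n\| \geq n \|\psi_n\| for every n, and no constant C can work.
```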

In this chapter and the following one, we are going to study unbounded
operators defined on dense subspaces of a Hilbert space $\mathbf{H}$. We will
introduce the "correct" notion of self-adjointness for unbounded operators,
namely the one for which the spectral theorem holds. As it turns out, the
obvious candidate for a definition of self-adjointness, namely that
$\langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle$ for all $\phi$ and $\psi$ in the domain of $A$, is not the correct
one. Rather, for any unbounded operator $A$, we will define another unbounded
operator $A^*$, the adjoint of $A$, with its own naturally defined domain. Then $A$ is
said to be self-adjoint if $A^*$ and $A$ are the same operators with the same
domain.

In the present chapter, we give the definition of an unbounded self-adjoint 
operator, along with conditions for self-adjointness and several examples 
and counterexamples. We defer a discussion of the spectral theorem itself 
until Chap. 10. The statement of the spectral theorem (either in terms of 
projection-valued measures or in terms of direct integrals) is essentially the 
same as in the bounded case, with only a few modifications to deal with 
the domain of the operator. 

Although this chapter is rather technical, a reader who is willing to ac- 
cept some things on faith may wish simply to read the definitions of self- 
adjoint and essentially self-adjoint operators in Sect. 9.2, and then skip to 
the statements of Theorem 9.21 and Corollary 9.22 in Sect. 9.5. As in pre- 
vious chapters, H will denote a separable Hilbert space over C. 


9.2 Adjoint and Closure of an Unbounded 
Operator 


Recall that we briefly introduced unbounded operators in Sect. 3.2.
According to Definition 3.1, an unbounded operator $A$ on $\mathbf{H}$ is a linear map of some
dense subspace $\mathrm{Dom}(A) \subset \mathbf{H}$ (the domain of $A$) into $\mathbf{H}$. As in Sect. 3.2,
"unbounded" means "not necessarily bounded," meaning that we permit
the case in which $\mathrm{Dom}(A) = \mathbf{H}$ and $A$ is bounded.

Now, if $A$ is bounded, then for any $\phi$, the linear functional

\[ \langle \phi, A\,\cdot\, \rangle \]

is bounded. Thus, by the Riesz theorem (Theorem A.52), there is a unique
$\chi$ such that

\[ \langle \phi, A\,\cdot\, \rangle = \langle \chi, \cdot\, \rangle. \]

We then define the adjoint $A^*$ of $A$ by setting $A^*\phi$ equal to $\chi$. (See
Sect. A.4.)

If $A$ is unbounded, then $\langle \phi, A\,\cdot\, \rangle$ is not necessarily bounded, but may be
bounded for certain vectors $\phi$. If $\langle \phi, A\,\cdot\, \rangle$ does happen to be bounded, for
some $\phi \in \mathbf{H}$, then the BLT theorem (Theorem A.36) says that this linear
functional has a unique bounded extension from $\mathrm{Dom}(A)$ to all of $\mathbf{H}$. The
Riesz theorem then tells us that there is a unique $\chi$ such that this linear
functional is "inner product with $\chi$." This line of reasoning leads to the
following definition, which was already introduced briefly in Sect. 3.2.


Definition 9.1 Suppose $A$ is an operator defined on a dense subspace
$\mathrm{Dom}(A) \subset \mathbf{H}$. Let $\mathrm{Dom}(A^*)$ be the space of all $\phi \in \mathbf{H}$ for which the
linear functional

\[ \psi \mapsto \langle \phi, A\psi \rangle, \qquad \psi \in \mathrm{Dom}(A), \]

is bounded. For $\phi \in \mathrm{Dom}(A^*)$, define $A^*\phi$ to be the unique vector such that
$\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$ for all $\psi \in \mathrm{Dom}(A)$.


Saying that $\langle \phi, A\,\cdot\, \rangle$ is bounded means, explicitly, that there exists a
constant $C$ such that $|\langle \phi, A\psi \rangle| \leq C\|\psi\|$ for all $\psi \in \mathrm{Dom}(A)$. As in the bounded
case, the operator $A^*$ is linear on its domain, and is called the adjoint of $A$.

Another way to think about the definition of $A^*$ is as follows. Given
a vector $\phi$, if there exists a vector $\chi$ such that $\langle \phi, A\psi \rangle = \langle \chi, \psi \rangle$ for all
$\psi \in \mathrm{Dom}(A)$, then $\phi$ belongs to $\mathrm{Dom}(A^*)$ and $A^*\phi = \chi$. By the Riesz
theorem, such a $\chi$ will exist if and only if $\langle \phi, A\,\cdot\, \rangle$ is bounded, which means
this way of thinking about $A^*$ is equivalent to Definition 9.1.

Given a densely defined operator $A$, the adjoint $A^*$ of $A$ could fail to
be densely defined. This situation, however, is a pathology that does not
usually occur for operators of interest in applications.


Definition 9.2 An unbounded operator $A$ on $\mathbf{H}$ is symmetric if

\[ \langle \phi, A\psi \rangle = \langle A\phi, \psi \rangle \tag{9.1} \]

for all $\phi, \psi \in \mathrm{Dom}(A)$.


As we will see shortly, if A is symmetric, then A* is an extension of A, 
in the sense of the following definition. 


Definition 9.3 An unbounded operator $A$ is an extension of an unbounded
operator $B$ if $\mathrm{Dom}(A) \supset \mathrm{Dom}(B)$ and $A = B$ on $\mathrm{Dom}(B)$.


If $A$ is an extension of $B$, then very likely $A$ is given by the same
"formula" as $B$. If $\mathbf{H} = L^2(\mathbb{R})$, for example, both operators might be given
by the formula $-i\hbar \, d/dx$ on their respective domains. Nevertheless, if
$\mathrm{Dom}(A) \neq \mathrm{Dom}(B)$, then $A$ is still a different operator from $B$.


Proposition 9.4 An unbounded operator A is symmetric if and only if A* 
is an extension of A. 


Proof. If $A$ is symmetric, then for all $\phi \in \mathrm{Dom}(A)$, (9.1) and the
Cauchy-Schwarz inequality show that

\[ |\langle \phi, A\psi \rangle| \leq \|A\phi\| \, \|\psi\|, \]

showing that $\phi \in \mathrm{Dom}(A^*)$. In that case, (9.1) shows that the unique vector
$A^*\phi$ for which $\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$ is nothing but $A\phi$, which means that $A^*$
agrees with $A$ on $\mathrm{Dom}(A)$.

In the other direction, if $A^*$ is an extension of $A$, then for each
$\phi \in \mathrm{Dom}(A)$, we have

\[ \langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle = \langle A\phi, \psi \rangle, \]

for all $\psi \in \mathrm{Dom}(A)$, which shows that $A$ is symmetric. ∎

We come now to the key definition of this section, that of self-adjointness.
This notion constitutes the hypothesis of the spectral theorem for
unbounded operators.


Definition 9.5 An unbounded operator A on H is self-adjoint if 
Dom(A*) = Dom(A) 
and A*¢ = Ad for all ¢ € Dom(A). 


We may reformulate the definition of self-adjointness by saying that A 
is self-adjoint if A* is equal to A, provided that equality of unbounded 
operators is understood to include equality of domains. Every self-adjoint 
operator is symmetric (by Proposition 9.4), but there exist many operators 
that are symmetric without being self-adjoint. In light of Proposition 9.4, 
a symmetric operator is self-adjoint if and only if Dom(A*) = Dom(A). In 
trying to show that a symmetric operator is self-adjoint, the difficulty lies 
in showing that Dom(A*) is no bigger than Dom(A).


Definition 9.6 An unbounded operator A on H is said to be closed if the
graph of A is a closed subset of $H \times H$. An unbounded operator A on H is
said to be closable if the closure in $H \times H$ of the graph of A is the graph
of a function. If A is closable, then the closure $A^{\mathrm{cl}}$ of A is the operator
with graph equal to the closure of the graph of A.


To be more explicit, an operator A is closed if and only if the following
condition holds: Suppose a sequence $\psi_n$ belongs to Dom(A) and suppose
that there exist vectors $\psi$ and $\phi$ in H with $\psi_n \to \psi$ and $A\psi_n \to \phi$. Then
$\psi$ belongs to Dom(A) and $A\psi = \phi$. Regarding closability, an operator A is
not closable if there exist two elements in the closure of the graph of A of
the form $(\phi, \psi)$ and $(\phi, \chi)$, with $\psi \neq \chi$. Another way of putting it is to
say that an operator A is closable if there exists some closed extension of
it, in which case the closure of A is the smallest closed extension of A.

The notion of the closure of a (closable) operator is useful because it
sweeps away some of the arbitrariness in the choice of a domain of an
operator. If we consider, for example, the operator $A = -i\hbar\, d/dx$ as an
unbounded operator on $L^2(\mathbb{R})$, there are many different reasonable choices
for Dom(A), including (1) the space of $C^\infty$ functions of compact support,
(2) the Schwartz space (Definition A.15), and (3) the space of continuously
differentiable functions $\psi$ for which both $\psi$ and $\psi'$ belong to $L^2(\mathbb{R})$. As it
turns out, each of these three choices for Dom(A) leads to the same operator
$A^{\mathrm{cl}}$. Note that we are not claiming that every choice for Dom(A) leads to
the same closure; nevertheless, it is often the case that many reasonable
choices do lead to the same closure.


Definition 9.7 An unbounded operator A on H is said to be essentially
self-adjoint if A is symmetric and closable and $A^{\mathrm{cl}}$ is self-adjoint.
Actually, as we shall see in the next section, a symmetric operator is
always closable. Many symmetric operators fail to be even essentially self- 
adjoint. We will see examples of such operators in Sects. 9.6 and 9.10. Sec- 
tion 9.5 gives some reasonably simple criteria for determining when a sym- 
metric operator is essentially self-adjoint. 


9.3. Elementary Properties of Adjoints and Closed 
Operators 


In this section, we spell out some of the most basic and useful properties 
of adjoints and closures of unbounded operators. In Sect. 9.5, we will draw 
on these results to prove some more substantial results. In what follows, 
if we say that two operators “coincide,” it means that they have the same 
domain and that they are equal on that common domain. 


Proposition 9.8 1. If A is an unbounded operator on H, then the 
graph of the operator A* (which may or may not be densely defined) 
is closed in H x H. 


2. A symmetric operator is always closable. 


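Proof. Suppose $\psi_n$ is a sequence in the domain of $A^*$ that converges to
some $\psi \in H$. Suppose also that $A^*\psi_n$ converges to some $\phi \in H$. Then
$\langle \psi_n, A\,\cdot \rangle = \langle A^*\psi_n, \cdot \rangle$ and for any $\chi \in \mathrm{Dom}(A)$, we have

\[ \langle \psi, A\chi \rangle = \lim_{n \to \infty} \langle \psi_n, A\chi \rangle = \lim_{n \to \infty} \langle A^*\psi_n, \chi \rangle = \langle \phi, \chi \rangle. \]

This shows that $\psi$ belongs to the domain of $A^*$ and that $A^*\psi = \phi$,
establishing that the graph of $A^*$ is closed.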

If A is symmetric, A* is an extension of A. Since, as we have just proved, 
A* is closed, A has a closed extension and is therefore closable. 


Corollary 9.9 If A is a symmetric operator with Dom(A) = H, then A is 
bounded. 


Proof. Since A is symmetric, it is closable by Proposition 9.8. But since 
the domain of A is already all of H, the closure of A must coincide with 
A itself. (The closure of A always agrees with A on Dom(A), which in this 
case is all of H.) Thus, A is a closed operator defined on all of H, and the 
closed graph theorem (Theorem A.39) implies that A is bounded. ∎


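Proposition 9.10 If A is a closable operator on H, then the adjoint of
$A^{\mathrm{cl}}$ coincides with the adjoint of A.


Proof. Suppose that for some $\psi \in H$ there exists a $\phi$ such that
$\langle \psi, A^{\mathrm{cl}}\chi \rangle = \langle \phi, \chi \rangle$ for all $\chi \in \mathrm{Dom}(A^{\mathrm{cl}})$. Since $A^{\mathrm{cl}}$ is an extension
of A, it follows that $\langle \psi, A\chi \rangle = \langle \phi, \chi \rangle$ for all $\chi \in \mathrm{Dom}(A)$. This shows
that $\mathrm{Dom}(A^*) \supset \mathrm{Dom}((A^{\mathrm{cl}})^*)$ and that $A^*$ agrees with $(A^{\mathrm{cl}})^*$ on
$\mathrm{Dom}((A^{\mathrm{cl}})^*)$.

In the other direction, suppose for some $\psi \in H$ there exists a $\phi$ such
that $\langle \psi, A\chi \rangle = \langle \phi, \chi \rangle$ for all $\chi \in \mathrm{Dom}(A)$. Suppose now
$\xi \in \mathrm{Dom}(A^{\mathrm{cl}})$ with $A^{\mathrm{cl}}\xi = \eta$. Then there exists a sequence $\chi_n$ in Dom(A)
with $\chi_n \to \xi$ and $A\chi_n \to \eta$, and we have

\[ \langle \psi, A\chi_n \rangle = \langle \phi, \chi_n \rangle \]

for all n. Letting n tend to infinity, we obtain $\langle \psi, \eta \rangle = \langle \phi, \xi \rangle$, or
$\langle \psi, A^{\mathrm{cl}}\xi \rangle = \langle \phi, \xi \rangle$. This shows that $\psi \in \mathrm{Dom}((A^{\mathrm{cl}})^*)$ and
$(A^{\mathrm{cl}})^*\psi = \phi$. Thus, $\mathrm{Dom}(A^*) \subset \mathrm{Dom}((A^{\mathrm{cl}})^*)$. ∎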


Proposition 9.11 If A is essentially self-adjoint, then $A^{\mathrm{cl}}$ is the unique
self-adjoint extension of A.


Proof. Suppose B is a self-adjoint extension of A. Since $B = B^*$, B is closed
and is, therefore, an extension of $A^{\mathrm{cl}}$. It then follows from the definition
of the adjoint that $\mathrm{Dom}(B^*) \subset \mathrm{Dom}(A^{\mathrm{cl}})$. Thus, we have

\[ \mathrm{Dom}(B^*) \subset \mathrm{Dom}(A^{\mathrm{cl}}) \subset \mathrm{Dom}(B). \]

Since B is self-adjoint, all three of the above sets must be equal, so actually
$B = A^{\mathrm{cl}}$. ∎


Proposition 9.12 If A is an unbounded operator on H, then
\[ (\mathrm{Range}(A))^{\perp} = \ker(A^*). \]


Proof. First assume that $\psi \in (\mathrm{Range}(A))^{\perp}$. Then for all $\phi \in \mathrm{Dom}(A)$ we
have

\[ \langle \psi, A\phi \rangle = 0. \]

That is to say, the linear functional $\langle \psi, A\,\cdot \rangle$ is bounded (in fact, zero)
on Dom(A). Thus, from the definition of the adjoint, we conclude that
$\psi \in \mathrm{Dom}(A^*)$ and $A^*\psi = 0$.

Meanwhile, suppose that $\psi$ is in $\mathrm{Dom}(A^*)$ and that $A^*\psi = 0$. The only
way this can happen is if the linear functional $\langle \psi, A\,\cdot \rangle$ is zero on Dom(A),
which means that $\psi$ is orthogonal to the image of A. ∎


Proposition 9.13 Suppose A is an unbounded operator on H and that B
is a bounded operator defined on all of H. Let $A + B$ denote the operator
with $\mathrm{Dom}(A + B) = \mathrm{Dom}(A)$ and given by $(A + B)\psi = A\psi + B\psi$ for all
$\psi \in \mathrm{Dom}(A)$. Then $(A + B)^*$ has the same domain as $A^*$ and
$(A + B)^*\psi = A^*\psi + B^*\psi$ for all $\psi \in \mathrm{Dom}(A^*)$.

In particular, the sum of an unbounded self-adjoint operator and a
bounded self-adjoint operator (defined on all of H) is self-adjoint on the
domain of the unbounded operator.
Proof. See Exercise 3. ∎
The sum of two unbounded self-adjoint operators is not, in general, self- 
adjoint. See Sect. 9.9 for more information about this issue. 


Proposition 9.14 Let A be a closed operator and $\lambda$ an element of $\mathbb{C}$.
Suppose that there exists $\varepsilon > 0$ such that

\[ \|(A - \lambda I)\psi\| \geq \varepsilon \|\psi\| \tag{9.2} \]
for all $\psi$ in Dom(A). Then the range of $A - \lambda I$ is a closed subspace of H.


Here, we take the domain of the operator $A - \lambda I$ to coincide with the
domain of A, as in Proposition 9.13.
Proof. Assume that $\phi_n$ is a sequence in the range of $A - \lambda I$ converging
to some $\phi$. Then $\phi_n = (A - \lambda I)\psi_n$ for some sequence $\psi_n$ in Dom(A).
Applying (9.2) with $\psi = \psi_n - \psi_m$ shows that
$\|\psi_n - \psi_m\| \leq (1/\varepsilon) \|\phi_n - \phi_m\|$. This means that $\psi_n$ is Cauchy and thus
convergent to some vector $\psi$. Since $\psi_n \to \psi$ and $(A - \lambda I)\psi_n = \phi_n \to \phi$,
we have that

\[ A\psi_n = \lambda\psi_n + \phi_n \to \lambda\psi + \phi. \]

Thus, by the definition of a closed operator, $\psi \in \mathrm{Dom}(A)$ and
$A\psi = \lambda\psi + \phi$. This means that $(A - \lambda I)\psi = \phi$ and so the range of
$A - \lambda I$ is closed. ∎

We conclude this section with a simple example for which we can compute 
the adjoint and closure explicitly. 


Example 9.15 Let $(e_j)$ be an orthonormal basis for H and let $(\lambda_j)$ be
an arbitrary sequence of real numbers. Define an operator A on H with
Dom(A) equal to the space of finite linear combinations of the $e_j$'s, with A
itself defined by

\[ Ae_j = \lambda_j e_j. \]

Then A is symmetric and closable and $\mathrm{Dom}(A^*) = \mathrm{Dom}(A^{\mathrm{cl}}) = V$, where
\[ V = \Big\{ \psi = \sum_j a_j e_j \;\Big|\; \sum_j (1 + \lambda_j^2) |a_j|^2 < \infty \Big\}. \tag{9.3} \]
For any $\psi = \sum_j a_j e_j$ in V, we have
\[ A^*\psi = A^{\mathrm{cl}}\psi = \sum_j a_j \lambda_j e_j. \tag{9.4} \]
Thus, $(A^{\mathrm{cl}})^* = A^* = A^{\mathrm{cl}}$, showing that A is essentially self-adjoint.


Proof. Note that for any sequence $(a_j)$ of coefficients satisfying the
condition on the right-hand side of (9.3), we have $\sum_j |a_j|^2 < \infty$ and, thus,
the sum $\sum_j a_j e_j$ converges in H. Suppose first that $\phi = \sum_j a_j e_j$ belongs
to V. Then for any $\psi = \sum_j b_j e_j$ (finite sum) in the domain of A we have

\[ \langle \phi, A\psi \rangle = \sum_j \overline{a_j} \lambda_j b_j \]

and so by the Cauchy–Schwarz inequality,

\[ |\langle \phi, A\psi \rangle| \leq \Big( \sum_j \lambda_j^2 |a_j|^2 \Big)^{1/2} \|\psi\|. \]

Thus, $\langle \phi, A\,\cdot \rangle$ is a bounded linear functional, showing that
$\phi \in \mathrm{Dom}(A^*)$. Furthermore, it is apparent that $\langle \phi, A\psi \rangle = \langle \chi, \psi \rangle$ for all
$\psi \in \mathrm{Dom}(A)$, where $\chi = \sum_j a_j \lambda_j e_j$.

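Meanwhile, suppose $\phi = \sum_j a_j e_j$ belongs to the domain of $A^*$, and
consider $\psi_N := \sum_{j=1}^N \lambda_j a_j e_j$ in Dom(A). Then

\[ \langle \phi, A\psi_N \rangle = \sum_{j=1}^N \lambda_j^2 |a_j|^2 = \Big( \sum_{j=1}^N \lambda_j^2 |a_j|^2 \Big)^{1/2} \|\psi_N\|. \]

Since $\phi \in \mathrm{Dom}(A^*)$, the functional $\langle \phi, A\,\cdot \rangle$ is bounded, and so
$\sum_{j=1}^N \lambda_j^2 |a_j|^2$ must be bounded, independent of N, and so
$\sum_j \lambda_j^2 |a_j|^2 < \infty$. Since $\phi$ belongs to H, we have also that
$\sum_j |a_j|^2 < \infty$, showing that $\phi$ is in V.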
Turning now to the closure of A, it is apparent that A is symmetric and
thus closable, by Proposition 9.8. Suppose $\psi = \sum_j a_j e_j$ belongs to V and
consider $\psi_N := \sum_{j=1}^N a_j e_j$. Clearly, $\psi_N$ converges to $\psi$. Furthermore,
since $\psi \in V$, we see that $A\psi_N$ converges to the vector $\sum_j a_j \lambda_j e_j$. This
shows that $\psi \in \mathrm{Dom}(A^{\mathrm{cl}})$ and that $A^{\mathrm{cl}}\psi = \sum_j a_j \lambda_j e_j$. Thus, each element
of V belongs to $\mathrm{Dom}(A^{\mathrm{cl}})$ and $A^{\mathrm{cl}}$ is given on V by (9.4).

Now, the space V forms a Hilbert space with respect to the norm given
by

\[ \|\psi\|_V^2 = \sum_j (1 + \lambda_j^2) |a_j|^2, \]

where $\psi = \sum_j a_j e_j$. [To establish completeness of V with respect to this
norm, note that V can be identified isometrically with $L^2(\mathbb{N})$ with respect
to the measure $\mu$ for which $\mu(\{j\}) = 1 + \lambda_j^2$.] Suppose, now, that we have a
sequence $(\psi_n)$ in Dom(A) for which both $(\psi_n)$ and $(A\psi_n)$ are convergent.
Then $(\psi_n)$ forms a Cauchy sequence in V which converges to some element
$\psi$ of V. Since $\|\psi\|_H \leq \|\psi\|_V$ for all $\psi \in \mathrm{Dom}(A)$, we see that $\psi_n$ also
converges in H to $\psi \in V$. This shows that each element of $\mathrm{Dom}(A^{\mathrm{cl}})$
belongs to V. ∎
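
The condition defining V in (9.3) can be explored numerically. The sketch below is only an illustration under assumptions that are not part of the example: it takes the particular eigenvalues $\lambda_j = j$ and two hypothetical coefficient sequences, $a_j = 1/j$ and $a_j = 1/j^2$, and the helper names are invented for this purpose. The first sequence defines a vector of H whose V-norm partial sums grow without bound, whereas the second has bounded partial sums.

```python
def eigenvalue(j):
    """Assumed eigenvalues lambda_j = j (an illustrative choice, not from the text)."""
    return float(j)

def slow_coeff(j):
    """a_j = 1/j: square-summable, but sum (1 + j^2)|a_j|^2 diverges."""
    return 1.0 / j

def fast_coeff(j):
    """a_j = 1/j^2: sum (1 + j^2)|a_j|^2 converges."""
    return 1.0 / j ** 2

def h_norm_sq_partial(coeff, n_terms):
    """Partial sum of sum_j |a_j|^2 over j = 1..n_terms."""
    return sum(abs(coeff(j)) ** 2 for j in range(1, n_terms + 1))

def v_norm_sq_partial(coeff, n_terms):
    """Partial sum of sum_j (1 + lambda_j^2)|a_j|^2 over j = 1..n_terms."""
    return sum((1.0 + eigenvalue(j) ** 2) * abs(coeff(j)) ** 2
               for j in range(1, n_terms + 1))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              h_norm_sq_partial(slow_coeff, n),
              v_norm_sq_partial(slow_coeff, n),
              v_norm_sq_partial(fast_coeff, n))
```

For $a_j = 1/j$ each term of the V-norm sum exceeds 1, so the partial sums are at least N; this vector lies in H but outside $\mathrm{Dom}(A^{\mathrm{cl}}) = V$.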
Recall that if A is a bounded operator, then a number $\lambda \in \mathbb{C}$ belongs to
the resolvent set of A if the operator $A - \lambda I$ has a bounded inverse, and $\lambda$
belongs to the spectrum of A if $A - \lambda I$ does not have a bounded inverse.
For an unbounded operator A, we will say that a number $\lambda \in \mathbb{C}$ is in the
resolvent set of A if $A - \lambda I$ has a bounded inverse. That is, even though
A is unbounded, for $\lambda$ to be in the resolvent set of A, there must be a
bounded inverse to $A - \lambda I$; otherwise, $\lambda$ is in the spectrum of A. We make
this characterization more precise in the following definition.


Definition 9.16 Suppose A is an unbounded operator on H. A number
$\lambda \in \mathbb{C}$ belongs to the resolvent set of A if there exists a bounded operator
B with the following properties: (1) For all $\psi \in H$, $B\psi$ belongs to Dom(A)
and $(A - \lambda I)B\psi = \psi$, and (2) for all $\psi \in \mathrm{Dom}(A)$ we have
$B(A - \lambda I)\psi = \psi$.

If no such bounded operator B exists, then $\lambda$ belongs to the spectrum of A.


Note that we are implicitly taking $\mathrm{Dom}(A - \lambda I)$ to equal Dom(A), as in
Proposition 9.13. As in the bounded case, even if A is self-adjoint, points
$\lambda$ in the spectrum of A are not necessarily eigenvalues; that is, there does
not necessarily exist a nonzero $\psi \in \mathrm{Dom}(A)$ with $A\psi = \lambda\psi$. On the other
hand, if $A\psi = \lambda\psi$ for some nonzero $\psi \in \mathrm{Dom}(A)$, then $A - \lambda I$ is not
injective and thus $\lambda$ certainly does belong to the spectrum of A.
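
For the closure of the diagonal operator of Example 9.15, the inverse of $A^{\mathrm{cl}} - \lambda I$, when it exists, is again diagonal with entries $1/(\lambda_j - \lambda)$, so it is bounded exactly when $\lambda$ stays a positive distance away from all the $\lambda_j$. The following sketch illustrates Definition 9.16 in that setting. It assumes the eigenvalues $\lambda_j = j$ and inspects only finitely many of them, so it is a heuristic check rather than a proof; the function name is hypothetical.

```python
def resolvent_norm(eigenvalues, lam):
    """Return sup_j 1/|lambda_j - lam| over the supplied eigenvalues.

    This is the operator norm of the diagonal candidate inverse of
    A - lam*I restricted to the listed eigenvalues; the result is
    infinite when lam coincides with one of them.
    """
    smallest_gap = min(abs(ev - lam) for ev in eigenvalues)
    return float("inf") if smallest_gap == 0 else 1.0 / smallest_gap

# Assumed spectrum lambda_j = j, truncated at j = 2000 for illustration.
eigs = [float(j) for j in range(1, 2001)]

if __name__ == "__main__":
    print(resolvent_norm(eigs, 0.5))  # real point between eigenvalues
    print(resolvent_norm(eigs, 1j))   # nonreal point
    print(resolvent_norm(eigs, 3.0))  # an eigenvalue
```

In this truncated picture, nonreal points and real points lying strictly between the eigenvalues give a finite bound, while an eigenvalue gives no bounded inverse.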


Theorem 9.17 If A is an unbounded self-adjoint operator on H, the
spectrum of A is contained in the real line.


If A is symmetric but not self-adjoint, then the spectrum of A must
contain points not in the real line. Indeed, Theorem 9.21 will show that at
least one of $(A - iI)$ and $(A + iI)$ must fail to be surjective, and thus at
least one of the numbers $i$ and $-i$ is in the spectrum of A. Nevertheless, a
symmetric operator cannot have nonreal eigenvalues, as we showed already
in Proposition 3.4.

Proof. Consider a complex number $\lambda = a + ib$ with $b \neq 0$. Since A is
symmetric, the proof of Lemma 7.8 applies, giving

\[ \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle \geq b^2 \langle \psi, \psi \rangle \tag{9.5} \]

for all $\psi \in \mathrm{Dom}(A)$. This shows that $(A - \lambda I)$ is injective.
Meanwhile, applying Propositions 9.12 and 9.13 with $B = -\lambda I$ we see
that

\[ (\mathrm{Range}(A - \lambda I))^{\perp} = \ker((A - \lambda I)^*) = \ker(A^* - \bar{\lambda} I) = \ker(A - \bar{\lambda} I). \]

Since $\bar{\lambda}$ again has nonzero imaginary part, $A - \bar{\lambda} I$ is also injective,
showing that $\mathrm{Range}(A - \lambda I)$ is dense in H. Since $A = A^*$ is closed, (9.5)
allows us to apply Proposition 9.14 to show that $\mathrm{Range}(A - \lambda I)$ is closed,
hence all of H.
We have shown, then, that $(A - \lambda I)$ maps Dom(A) injectively onto H. It
follows from (9.5) (or the closed graph theorem) that the inverse operator
is bounded, so that $\lambda$ is in the resolvent set of A. ∎
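
The lower bound (9.5) can be checked by hand in finite dimensions, where a Hermitian matrix plays the role of the symmetric operator. The sketch below is such a sanity check and nothing more: the 3-by-3 matrix, the vector, and the value of $\lambda$ are arbitrary choices made for illustration, and the helper names are hypothetical. It also verifies the identity behind the bound, $\|(A - \lambda I)\psi\|^2 = \|(A - aI)\psi\|^2 + b^2\|\psi\|^2$ for $\lambda = a + ib$.

```python
def apply(matrix, vec):
    """Matrix-vector product for a small dense matrix given as nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def norm_sq(vec):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(v) ** 2 for v in vec)

def shifted_norm_sq(matrix, lam, vec):
    """Compute ||(A - lam*I) psi||^2."""
    a_vec = apply(matrix, vec)
    return norm_sq([av - lam * v for av, v in zip(a_vec, vec)])

# An arbitrary Hermitian matrix (equal to its conjugate transpose).
A = [[2.0, 1 - 1j, 0.5j],
     [1 + 1j, -3.0, 4.0],
     [-0.5j, 4.0, 1.0]]
psi = [1 + 2j, -1j, 3.0]
lam = 0.7 + 0.25j

lhs = shifted_norm_sq(A, lam, psi)          # ||(A - lam I) psi||^2
bound = lam.imag ** 2 * norm_sq(psi)        # b^2 ||psi||^2
real_part = shifted_norm_sq(A, lam.real, psi)  # ||(A - a I) psi||^2

if __name__ == "__main__":
    print(lhs, bound, real_part + bound)
```

The cross term vanishes because $\langle (A - aI)\psi, \psi \rangle$ is real for a Hermitian matrix, which is the same mechanism the proof of Lemma 7.8 relies on.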

Our next result shows that the spectrum of an unbounded self-adjoint 
operator has properties similar to that of a bounded self-adjoint operator. 


Proposition 9.18 If A is an unbounded self-adjoint operator on H, then 
the following hold. 


1. A number $\lambda \in \mathbb{R}$ belongs to the spectrum of A if and only if there
exists a sequence $\psi_n$ of nonzero vectors in Dom(A) such that

\[ \lim_{n \to \infty} \frac{\|(A - \lambda I)\psi_n\|}{\|\psi_n\|} = 0. \tag{9.6} \]


2. The spectrum o(A) of A is a closed subset of R. 


Although the spectrum of a bounded self-adjoint operator is a bounded 

subset of R, the spectrum of an unbounded self-adjoint operator will be 
unbounded. Indeed, it can be shown (using the spectral theorem) that if 
a self-adjoint operator has bounded spectrum, then the operator must be 
bounded. 
Proof. For Point 1, if a sequence as in (9.6) existed, then as in the proof
of Proposition 7.7, $A - \lambda I$ could not have a bounded inverse, so $\lambda$ must be
in the spectrum of A. Conversely, suppose no such sequence exists. Then
there is some $\varepsilon > 0$ such that

\[ \|(A - \lambda I)\psi\| \geq \varepsilon \|\psi\| \tag{9.7} \]

for all $\psi \in \mathrm{Dom}(A)$. This means that $A - \lambda I$ is injective and that, by
Proposition 9.14, the range of $A - \lambda I$ is closed. But

\[ (A - \lambda I)^* = A^* - \bar{\lambda} I = A - \lambda I \]

and $A - \lambda I$ is injective, so by Proposition 9.12, the range of $A - \lambda I$ is
dense and, being closed, is all of H. This means $A - \lambda I$ has an inverse,
which is bounded by (9.7). Thus $\lambda$ is not in the spectrum of A.

Point 2 is left as an exercise (Exercise 4). ∎


Definition 9.19 Let A be an unbounded operator on H. Then A is
non-negative if $\langle \psi, A\psi \rangle \geq 0$ for all $\psi \in \mathrm{Dom}(A)$ and A is bounded below by
$c \in \mathbb{R}$ if $\langle \psi, A\psi \rangle \geq c \|\psi\|^2$ for all $\psi \in \mathrm{Dom}(A)$.


Proposition 9.20 Let A be an unbounded self-adjoint operator on H. If
A is non-negative, then the spectrum of A is contained in $[0, \infty)$. More
generally, if A is bounded below by c, then the spectrum of A is contained
in $[c, \infty)$.
We will eventually see, using the spectral theorem for unbounded
self-adjoint operators, that the converse to Proposition 9.20 also holds: If the
spectrum of a self-adjoint operator A is contained in $[0, \infty)$, then A is
non-negative, and if the spectrum of A is contained in $[c, \infty)$, then A is
bounded below by c. These results follow easily, for example, from the form
of the spectral theorem in Theorem 10.9.

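Proof. Suppose A is bounded below by c and $\lambda$ is a point in the spectrum
of A. If $\psi_n$ is a sequence as in Point 1 of Proposition 9.18, with the $\psi_n$'s
normalized to be unit vectors, then

\[ |\langle \psi_n, (A - \lambda I)\psi_n \rangle| \leq \|\psi_n\| \, \|(A - \lambda I)\psi_n\| \to 0. \]

On the other hand, $A = \lambda I + (A - \lambda I)$, and so

\[ \langle \psi_n, A\psi_n \rangle = \lambda \langle \psi_n, \psi_n \rangle + \langle \psi_n, (A - \lambda I)\psi_n \rangle. \]

Thus, $\langle \psi_n, A\psi_n \rangle$ converges to $\lambda$ ($= \lambda \langle \psi_n, \psi_n \rangle$) as n tends to infinity.
Since A is bounded below by c, we must have $\lambda \geq c$. This establishes the
result for operators bounded below by c. Specializing to $c = 0$ gives the
result for non-negative operators. ∎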


9.5 Conditions for Self-Adjointness and Essential 
Self-Adjointness 


In this section, we give criteria for determining whether a symmetric oper- 
ator is self-adjoint or essentially self-adjoint. See also Sect. 10.2 for the con- 
nection between self-adjoint operators and one-parameter unitary groups. 


Theorem 9.21 If A is a symmetric operator on H, then A is essentially 
self-adjoint if and only if Range(A — iI) and Range(A + iI) are dense 
subspaces of H. 


Using Proposition 9.12, we can reformulate this result as follows. 


Corollary 9.22 If A is a symmetric operator on H, then A is essentially
self-adjoint if and only if the operators $A^* + iI$ and $A^* - iI$ are injective
on $\mathrm{Dom}(A^*)$.


As Exercise 11 shows, it is possible to have one of the operators $A^* + iI$
and $A^* - iI$ be injective and the other fail to be injective.
Proof of Theorem 9.21. Assume first that A is essentially self-adjoint,
so that $A^{\mathrm{cl}}$ is self-adjoint. Then $A^* = (A^{\mathrm{cl}})^* = A^{\mathrm{cl}}$, and so

\[ [\mathrm{Range}(A - iI)]^{\perp} = \ker(A^* + iI) = \ker(A^{\mathrm{cl}} + iI) = \{0\}, \]

by Theorem 9.17, and similarly for the range of $A + iI$.
Conversely, assume A is symmetric and that $A - iI$ and $A + iI$ both
have dense range. Since $(A^{\mathrm{cl}})^* = A^*$ is a closed extension of A, it is also
an extension of $A^{\mathrm{cl}}$, showing that $A^{\mathrm{cl}}$ is symmetric. We may then apply
Lemma 7.8 (the proof of which requires only symmetry) to the operator
$A^{\mathrm{cl}}$ with $\lambda = i$, giving

\[ \|(A^{\mathrm{cl}} - iI)\psi\|^2 \geq \|\psi\|^2 \tag{9.8} \]

and showing that $A^{\mathrm{cl}} - iI$ is injective. Since the range of $A - iI$ is dense,
the range of $A^{\mathrm{cl}} - iI$ is certainly also dense. But since $A^{\mathrm{cl}}$ is closed, (9.8)
and Proposition 9.14 tell us that the range of $A^{\mathrm{cl}} - iI$ is closed, hence all
of H. Similar reasoning shows that the range of $A^{\mathrm{cl}} + iI$ is also all of H.

Now, by Proposition 9.13, $(A^{\mathrm{cl}} - iI)^* = (A^{\mathrm{cl}})^* + iI$, which is an extension
of $A^{\mathrm{cl}} + iI$. Suppose $(A^{\mathrm{cl}})^* + iI$ is a proper extension of $A^{\mathrm{cl}} + iI$, that is,
that the domain of $(A^{\mathrm{cl}})^* + iI$ is strictly bigger than the domain of
$A^{\mathrm{cl}} + iI$. Then since $A^{\mathrm{cl}} + iI$ already maps onto H, $(A^{\mathrm{cl}})^* + iI$ cannot be
injective. Thus, the operator

\[ (A^{\mathrm{cl}})^* + iI = A^* + iI = (A - iI)^* \]

must have a nontrivial kernel. Then by Proposition 9.12, $\mathrm{Range}(A - iI)$ is
not dense, contradicting our assumptions.

We conclude, therefore, that $(A^{\mathrm{cl}})^* + iI$ is not a proper extension of
$A^{\mathrm{cl}} + iI$, i.e., that $(A^{\mathrm{cl}})^* + iI = A^{\mathrm{cl}} + iI$ (with equality of domains). This,
by Proposition 9.13, means that $(A^{\mathrm{cl}})^* = A^{\mathrm{cl}}$ (with equality of domains),
which is what we are trying to prove. ∎


Proposition 9.23 If A is a symmetric operator on H, then A is
self-adjoint if and only if

\[ \mathrm{Range}(A - iI) = \mathrm{Range}(A + iI) = H. \]


Proof. Suppose first that A is self-adjoint. Then by Theorem 9.21, the
ranges of $A - iI$ and $A + iI$ are dense in H. On the other hand,

\[ \|(A - iI)\psi\|^2 \geq \|\psi\|^2 \tag{9.9} \]

by (the proof of) Lemma 7.8, with $\lambda = i$. Since, also, $A = A^*$ is closed,
Proposition 9.14 tells us that the range of $A - iI$ is closed, hence all of H.
A similar argument shows that the range of $A + iI$ is all of H.

Conversely, suppose that the ranges of $A - iI$ and $A + iI$ are all of H.
Then A is essentially self-adjoint by Theorem 9.21, so that $A^{\mathrm{cl}}$ is
self-adjoint, and hence so is $A^* = (A^{\mathrm{cl}})^* = A^{\mathrm{cl}}$. Since $A - iI$ already maps
onto H, if $A^*$ were a nontrivial extension of A, then $A^* - iI$ could not be
injective. But (9.9), with A replaced by $A^*$, shows that $A^* - iI$ is
injective. Thus, $A = A^*$ and so A is self-adjoint. ∎

In the case that A is positive-semidefinite (i.e., $\langle \psi, A\psi \rangle \geq 0$ for all
$\psi \in \mathrm{Dom}(A)$), there is another self-adjointness condition, the proof of
which is very similar to that of Theorem 9.22.


Theorem 9.24 Suppose that A is a symmetric operator on H and that
$\langle \psi, A\psi \rangle \geq 0$ for all $\psi \in \mathrm{Dom}(A)$. Then A is essentially self-adjoint if and
only if $A + I$ has dense range. Equivalently, A is essentially self-adjoint if
and only if $A^* + I$ is injective.


Proof. Assume first that A is essentially self-adjoint. Then
$(A + I)^* = A^* + I = A^{\mathrm{cl}} + I$. It is easily seen that $A^{\mathrm{cl}}$ is also positive
definite, and so

\[ \langle \psi, (A^{\mathrm{cl}} + I)\psi \rangle = \langle \psi, \psi \rangle + \langle \psi, A^{\mathrm{cl}}\psi \rangle \geq \langle \psi, \psi \rangle. \tag{9.10} \]

Thus, $A^{\mathrm{cl}} + I = (A + I)^*$ is injective. Thus, the range of $A + I$ is dense, by
Proposition 9.12.

Now assume that $A + I$ has dense range. By (9.10), $A^{\mathrm{cl}} + I$ is injective and
by (9.10) and Proposition 9.14, the range of $A^{\mathrm{cl}} + I$ is closed, hence all of
H. Assume $\mathrm{Dom}(A^*)$ is strictly larger than $\mathrm{Dom}(A^{\mathrm{cl}})$. Then because
$A^{\mathrm{cl}} + I$ is already surjective, $A^* + I$ (which has a domain equal to the
domain of $A^*$) cannot be injective. Thus, $A^* + I = (A + I)^*$ has a nontrivial
kernel, which means that the range of $A + I$ is not dense. This is a
contradiction, and so the domain of $A^*$ must actually be equal to the domain
of $A^{\mathrm{cl}}$. Since A and so also $A^{\mathrm{cl}}$ are symmetric, this means that $A^{\mathrm{cl}}$ is
self-adjoint. ∎


Example 9.25 Suppose that A is a symmetric operator on H that has
an orthonormal basis of eigenvectors. That is to say, suppose there is an
orthonormal basis $\{e_j\}$ for H such that for each j, we have $e_j \in \mathrm{Dom}(A)$
and $Ae_j = \lambda_j e_j$ for some real number $\lambda_j$. Then A is essentially self-adjoint.


This result is a strengthening of Example 9.15, in that we do not assume
that the domain of A is equal to the space of finite linear combinations of
the $e_j$'s.

Proof. For any j, $(A - iI)e_j = (\lambda_j - i)e_j$. Since $\lambda_j$ is real, we have a
nonzero multiple of $e_j$ belonging to $\mathrm{Range}(A - iI)$, for each j. This shows
that $\mathrm{Range}(A - iI)$ is dense, and similarly for $\mathrm{Range}(A + iI)$. ∎


Example 9.26 Suppose H is a Hilbert space direct sum of a sequence of
separable Hilbert spaces $H_j$:

\[ H = \bigoplus_{j=1}^{\infty} H_j. \]

Suppose also that $A_j$ is a bounded self-adjoint operator on $H_j$, for each j.
Define a subspace V of H by

\[ V = \Big\{ \psi = (\psi_1, \psi_2, \ldots) \;\Big|\; \sum_j \big( \|\psi_j\|^2 + \|A_j\psi_j\|^2 \big) < \infty \Big\}. \]

Suppose now that A is a symmetric operator on H whose domain contains
the finite direct sum of the $H_j$'s and such that $A|_{H_j} = A_j$. Then A is
essentially self-adjoint, $\mathrm{Dom}(A^{\mathrm{cl}}) = \mathrm{Dom}(A^*) = V$, and
\[ A^{\mathrm{cl}}\psi = A^*\psi = (A_1\psi_1, A_2\psi_2, \ldots) \tag{9.11} \]

for all $\psi = (\psi_1, \psi_2, \ldots)$ in V.


See Definition A.45 for the definition of the Hilbert direct sum and the
finite direct sum of a sequence of Hilbert spaces. Example 9.25 is the special
case of Example 9.26 in which each $H_j$ has dimension 1. This result will
be useful to us in Chap. 10.
Proof. Since $A_j$ is self-adjoint, the ranges of $A_j - iI$ and $A_j + iI$ are
dense in $H_j$. Thus, the closure of the range of $A - iI$ contains each $H_j$
and is therefore dense in H, and similarly for $A + iI$. This shows that A is
essentially self-adjoint.

It remains to show that the domain of $A^* = A^{\mathrm{cl}}$ is V. Let W denote the
finite direct sum of the $H_j$'s. By the argument in the previous paragraph,
$A|_W$ is essentially self-adjoint. Then $A^*$ is a symmetric extension of
$(A|_W)^*$, which must coincide with $(A|_W)^*$. Thus, it suffices to consider the
case $\mathrm{Dom}(A) = W$.

If we assume that $\mathrm{Dom}(A) = W$, we can compute the adjoint of A by the
argument in Example 9.15. If $\phi \in V$, then the Cauchy–Schwarz inequality
shows that the linear functional $\langle \phi, A\,\cdot \rangle$ is bounded and that $A^*\phi$ is as
in (9.11). On the other hand, if $\langle \phi, A\,\cdot \rangle$ is bounded, where
$\phi = (\phi_1, \phi_2, \ldots)$, take

\[ \psi_N = (A_1\phi_1, A_2\phi_2, \ldots, A_N\phi_N, 0, 0, \ldots). \]

Then, as in the proof of Example 9.15, the only way we can have
$|\langle \phi, A\psi_N \rangle| \leq C \|\psi_N\|$ is if $\phi$ belongs to V. ∎


9.6 A Counterexample 


In this section, we will examine an elementary example of an operator that 
is symmetric but not essentially self-adjoint. Our example will be essen- 
tially the momentum operator on a finite interval, with “wrong” boundary 
conditions. (A more sophisticated example is given in Sect. 9.10.) We take 
our Hilbert space to be $L^2([0, 1])$.


Proposition 9.27 Let $\mathrm{Dom}(A) \subset L^2([0,1])$ be the space of continuously
differentiable functions $\psi$ on $[0,1]$ satisfying
\[ \psi(0) = \psi(1) = 0. \]

For $\psi \in \mathrm{Dom}(A)$, define

\[ A\psi = -i\hbar \frac{d\psi}{dx}. \]

Then A is symmetric but not essentially self-adjoint.
We can understand the failure of essential self-adjointness of A in
practical terms as a failure of the spectral theorem. The eigenvector equation
$A\psi = \lambda\psi$ for $\lambda \in \mathbb{R}$ is a first-order ordinary differential equation, whose
general solution is $\psi(x) = c e^{i\lambda x/\hbar}$, where c is a constant. The only way
such a function can satisfy the boundary conditions $\psi(0) = \psi(1) = 0$ is if
$c = 0$, in which case $\psi$ is the zero vector. Thus, A has no eigenvectors.
Furthermore, taking the closure of A does not help, because, as the proof
will show, the boundary conditions survive taking the closure.

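Proof of symmetry. Using integration by parts we see that for all $\phi$ and
$\psi$ in Dom(A) we have

\[ \int_0^1 \overline{\phi(x)} \frac{d\psi}{dx}\, dx = \overline{\phi(1)}\psi(1) - \overline{\phi(0)}\psi(0) - \int_0^1 \overline{\frac{d\phi}{dx}}\, \psi(x)\, dx. \tag{9.12} \]

Since we assume $\phi$ and $\psi$ are in Dom(A), the boundary terms are zero and
we get

\[ \Big\langle \phi, \frac{d\psi}{dx} \Big\rangle_{L^2([0,1])} = - \Big\langle \frac{d\phi}{dx}, \psi \Big\rangle_{L^2([0,1])}. \]

Because there is a conjugate in one side of the inner product but not the
other, it follows that

\[ \Big\langle \phi, -i\hbar \frac{d\psi}{dx} \Big\rangle_{L^2([0,1])} = \Big\langle -i\hbar \frac{d\phi}{dx}, \psi \Big\rangle_{L^2([0,1])} \]

as claimed. ∎
We now consider $A^{\mathrm{cl}}$ and $A^* = (A^{\mathrm{cl}})^*$. We will see that there are elements
of the domain of the adjoint that are not in the domain of the closure.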


Lemma 9.28 If $\phi$ is a continuously differentiable function on $[0,1]$, then
$\phi \in \mathrm{Dom}(A^*)$ and $A^*\phi = -i\hbar\, d\phi/dx$.


Proof. If $\phi$ is continuously differentiable, then for any $\psi$ in Dom(A), we
may integrate by parts as in (9.12). Since $\psi$ is zero at both ends of the
interval, the boundary terms vanish and we obtain

\[ \langle \phi, A\psi \rangle = i\hbar \int_0^1 \overline{\frac{d\phi}{dx}}\, \psi(x)\, dx = \int_0^1 \overline{\Big( -i\hbar \frac{d\phi}{dx} \Big)}\, \psi(x)\, dx. \tag{9.13} \]

Since $d\phi/dx$ is continuous and hence in $L^2([0,1])$, we see that (9.13) is a
continuous linear functional, as a function of $\psi$ with fixed $\phi$. Thus, $\phi$ is in
the domain of $A^*$, and $A^*\phi = -i\hbar\, d\phi/dx$. ∎

Proof of Proposition 9.27. Suppose $\psi$ is in the domain of $A^{\mathrm{cl}}$. Then
there exist $\psi_n$ in Dom(A) such that $\psi_n$ converges to $\psi$ and $A\psi_n$ converges
to some $\phi \in L^2([0,1])$. Since the derivatives of the $\psi_n$'s are converging in
$L^2$, the $\psi_n$'s themselves must be converging uniformly, as can be shown by
writing each $\psi_n$ as the integral of its derivative. (See Exercise 10.) It follows
that every element of $\mathrm{Dom}(A^{\mathrm{cl}})$ is continuous and vanishes at both ends of
the interval. On the other hand, $\mathrm{Dom}(A^*)$ contains all smooth functions,
including many that do not vanish at the ends of the interval. Thus, $A^{\mathrm{cl}}$
and $(A^{\mathrm{cl}})^* = A^*$ do not have the same domains. ∎

It follows from Lemma 9.28 that every complex number belongs to the
spectrum of $A^{\mathrm{cl}}$. See Exercise 9.

The reason that A fails to be essentially self-adjoint is that we impose too 
many boundary conditions on functions in the domain of A, which results 
in there being too few boundary conditions (in this case, no boundary 
conditions at all) on functions in the domain of $A^*$. In this example, $A^*$ is
given by the same formula as A ($-i\, d/dx$ in both cases), but the domain of
$A^*$ is bigger than the domain of $A^{\mathrm{cl}}$.
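
The failure described here can be seen concretely through Corollary 9.22. By Lemma 9.28, the function $\phi(x) = e^{x}$ lies in $\mathrm{Dom}(A^*)$, and with $\hbar = 1$ it satisfies $A^*\phi = -i\phi$, so $\phi$ is a nonzero element of $\ker(A^* + iI)$ and must be orthogonal to $\mathrm{Range}(A - iI)$. The sketch below checks this orthogonality numerically for one test function. It is an illustration only: the choice $\hbar = 1$, the test functions $\sin(\pi x)$ and $\cos(\pi x)$, Simpson's rule, and all function names are assumptions made for the example.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule for a complex-valued integrand (n must be even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def pairing(psi, dpsi):
    """Approximate <phi, (A - iI) psi> with phi(x) = e^x and A = -i d/dx.

    phi is real-valued, so no complex conjugate is needed on it.
    """
    return simpson(
        lambda x: math.exp(x) * (-1j * dpsi(x) - 1j * psi(x)), 0.0, 1.0)

# sin(pi x) vanishes at both endpoints, so it belongs to Dom(A).
inside = pairing(lambda x: math.sin(math.pi * x),
                 lambda x: math.pi * math.cos(math.pi * x))

# cos(pi x) violates the boundary conditions, so it is outside Dom(A).
outside = pairing(lambda x: math.cos(math.pi * x),
                  lambda x: -math.pi * math.sin(math.pi * x))

if __name__ == "__main__":
    print(abs(inside), outside)
```

The first pairing vanishes up to quadrature error, because the integrand is $-i$ times the derivative of $e^{x}\psi(x)$ and the boundary values of $\psi$ are zero. The second pairing is $i(e + 1)$, which shows that the orthogonality really does depend on the boundary conditions built into Dom(A).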

Suppose we define another operator B, still given by the formula $-i\, d/dx$,
but with the domain of B to be the space of continuously differentiable
functions $\psi$ with $\psi(0) = \psi(1)$. If we integrate by parts as in (9.12), the
boundary terms will cancel, showing that B is symmetric. Meanwhile, the
functions $\psi_n(x) := e^{2\pi i n x}$, $n \in \mathbb{Z}$, form an orthonormal basis for
$L^2([0, 1])$ consisting of eigenvectors for B, with real eigenvalues
$\lambda_n = 2\pi n$. Thus, by Example 9.25, B is essentially self-adjoint.
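
The two facts used about the functions $\psi_n(x) = e^{2\pi i n x}$, orthonormality in $L^2([0,1])$ and the eigenvalue equation $B\psi_n = 2\pi n\, \psi_n$, are easy to confirm numerically. The following sketch does so with a uniform Riemann sum and a central difference quotient; both numerical methods and the helper names are choices made for this illustration, not part of the text.

```python
import cmath
import math

def psi(n, x):
    """The periodic function psi_n(x) = exp(2 pi i n x)."""
    return cmath.exp(2j * math.pi * n * x)

def inner(m, n, samples=256):
    """Approximate <psi_m, psi_n> on [0, 1] by a uniform Riemann sum.

    The inner product is conjugate-linear in the first factor.
    """
    return sum(psi(m, k / samples).conjugate() * psi(n, k / samples)
               for k in range(samples)) / samples

def apply_B(n, x, h=1e-5):
    """Central-difference approximation of (-i d/dx) psi_n at the point x."""
    return -1j * (psi(n, x + h) - psi(n, x - h)) / (2 * h)

if __name__ == "__main__":
    print(abs(inner(3, 3)), abs(inner(2, -5)))
    print(apply_B(3, 0.3), 2 * math.pi * 3 * psi(3, 0.3))
```

Each $\psi_n$ is periodic, so it satisfies the boundary condition $\psi(0) = \psi(1)$ defining the domain of B, and the eigenvalue $2\pi n$ is real, as Example 9.25 requires.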


9.7 An Example 


We now give an example of an operator that is essentially self-adjoint. Let
$C_c^\infty(\mathbb{R})$ denote the space of smooth, compactly supported functions on $\mathbb{R}$.


Proposition 9.29 Let P be the densely defined operator with
$\mathrm{Dom}(P) = C_c^\infty(\mathbb{R}) \subset L^2(\mathbb{R})$ and given by $P\psi = -i\hbar\, d\psi/dx$. Then P is
essentially self-adjoint.











Proof. Our strategy is to apply Corollary 9.22. Since P is symmetric, we
expect that $P^*$ will be given by the formula $-i\hbar\, d/dx$, on some suitable
domain inside $L^2(\mathbb{R})$. Thus, if $\psi \in \ker(P^* + iI)$, this should mean that
$-i\hbar\, d\psi/dx = -i\psi$, or $d\psi/dx = (1/\hbar)\psi(x)$, which ought to imply that
$\psi(x) = c e^{x/\hbar}$, for some constant c. Since $c e^{x/\hbar}$ belongs to $L^2(\mathbb{R})$ only if
$c = 0$, we hope to conclude that $\psi = 0$.

To say that $\psi \in L^2(\mathbb{R})$ belongs to the kernel of $P^* + iI$ means that $\psi$
belongs to $\mathrm{Dom}(P^*)$ and that $P^*\psi = -i\psi$. This holds if and only if

\[ -i\hbar \int_{-\infty}^{\infty} \frac{d\chi}{dx}\, \overline{\psi(x)}\, dx = i \int_{-\infty}^{\infty} \chi(x)\, \overline{\psi(x)}\, dx \]
for all $\chi \in C_c^\infty(\mathbb{R})$. For any $\xi \in C_c^\infty(\mathbb{R})$, if we take $\chi(x) = \xi(x) e^{-x/\hbar}$ and
combine the integrals into one, we get

\[ 0 = -i\hbar \int_{-\infty}^{\infty} \Big[ e^{-x/\hbar} \frac{d\xi}{dx} - \frac{1}{\hbar} e^{-x/\hbar} \xi(x) + \frac{1}{\hbar} e^{-x/\hbar} \xi(x) \Big] \overline{\psi(x)}\, dx
   = -i\hbar \int_{-\infty}^{\infty} \frac{d\xi}{dx}\, e^{-x/\hbar}\, \overline{\psi(x)}\, dx. \tag{9.14} \]

Now, (9.14) says that the derivative of $e^{-x/\hbar}\psi(x)$ in the weak or
distributional sense is zero. (See Proposition A.29 in Appendix A.3.3.) Thus, by the
remarks immediately following Proposition A.5, we must have
$e^{-x/\hbar}\psi(x) = c$ for some c, meaning that $\psi(x) = c e^{x/\hbar}$. Since we also assume
that $\psi$ belongs to $\mathrm{Dom}(P^*) \subset L^2(\mathbb{R})$, we must have $c = 0$, so that $\psi$ is the
zero element of $L^2(\mathbb{R})$.

We have shown, then, that only 0 belongs to the kernel of $P^* + iI$. A
similar argument with $i$ replaced by $-i$ and $e^{x/\hbar}$ by $e^{-x/\hbar}$ shows that only
0 belongs to the kernel of $P^* - iI$. Thus, by Corollary 9.22, P is essentially
self-adjoint. ∎


9.8 The Basic Operators of Quantum Mechanics 


In this section, we consider several of the unbounded self-adjoint operators
that arise in quantum mechanics. We find natural domains of
self-adjointness for the position, momentum, kinetic energy, and potential energy
operators. Since Schrödinger operators are more complicated to analyze,
we postpone a discussion of them until the next section. We begin with the
potential energy operator.


Proposition 9.30 Suppose $V : \mathbb{R}^n \to \mathbb{R}$ is a measurable function. Let
$V(X)$ be the unbounded operator with domain

\[ \mathrm{Dom}(V(X)) = \{ \psi \in L^2(\mathbb{R}^n) \mid V(x)\psi(x) \in L^2(\mathbb{R}^n) \} \]
and given by
\[ [V(X)\psi](x) = V(x)\psi(x). \]
Then $\mathrm{Dom}(V(X))$ is dense in $L^2(\mathbb{R}^n)$ and $V(X)$ is self-adjoint on this
domain.

Proof. Define a subset $E_m$ of $\mathbb{R}^n$ by
\[ E_m = \{ x \in \mathbb{R}^n \mid |V(x)| < m \}, \]

so that $\bigcup_m E_m = \mathbb{R}^n$. Then for any $\psi \in L^2(\mathbb{R}^n)$, the function $\psi 1_{E_m}$
belongs to $\mathrm{Dom}(V(X))$. On the other hand, using dominated convergence, we
have $\psi 1_{E_m} \to \psi$ as $m \to \infty$, establishing that $\mathrm{Dom}(V(X))$ is dense.
Since V is real-valued, it is easy to see that $V(X)$ is symmetric on
$\mathrm{Dom}(V(X))$. Thus, $V(X)^*$ is an extension of $V(X)$.
Meanwhile, suppose $\phi \in \mathrm{Dom}(V(X)^*)$, meaning that

\[ \psi \mapsto \int_{\mathbb{R}^n} \overline{\phi(x)}\, V(x)\psi(x)\, dx, \qquad \psi \in \mathrm{Dom}(V(X)) \tag{9.15} \]

is a bounded linear functional. This linear functional has a unique bounded
extension to $L^2$ and, thus, there exists a unique $\chi \in L^2(\mathbb{R}^n)$ such that

\[ \int_{\mathbb{R}^n} \overline{\phi(x)}\, V(x)\psi(x)\, dx = \int_{\mathbb{R}^n} \overline{\chi(x)}\, \psi(x)\, dx, \tag{9.16} \]

or

\[ \int_{\mathbb{R}^n} \big[ \overline{\phi(x)}\, V(x) - \overline{\chi(x)} \big] \psi(x)\, dx = 0 \]

for all $\psi \in \mathrm{Dom}(V(X))$.

Taking $\psi = (\phi V - \chi) 1_{E_m}$, we see that $\phi V - \chi$ is zero almost everywhere
on $E_m$, for all m, hence zero almost everywhere on $\mathbb{R}^n$. Thus, $\phi V$ is equal
to $\chi$ as an element of $L^2(\mathbb{R}^n)$. This shows that $\phi \in \mathrm{Dom}(V(X))$. Thus,
actually, $\mathrm{Dom}(V(X)^*) = \mathrm{Dom}(V(X))$. Since we have already shown that
$V(X)^*$ is an extension of $V(X)$, we conclude that $V(X)$ is self-adjoint on
$\mathrm{Dom}(V(X))$. ∎

If we specialize the preceding proposition to the case $V(x) = x_j$, we
obtain the following result about the position operator.


Corollary 9.31 The position operator $X_j$ is self-adjoint on the domain
\[ \mathrm{Dom}(X_j) = \{ \psi \in L^2(\mathbb{R}^n) \mid x_j\psi(x) \in L^2(\mathbb{R}^n) \}. \]
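
That the domain in Corollary 9.31 is a proper subspace of $L^2$ can be seen from a single function. As an illustration with $n = 1$, take the hypothetical example $\psi(x) = (1 + x^2)^{-1/2}$, which is not taken from the text. Its truncated norms have closed forms, $\int_{-L}^{L} |\psi|^2\, dx = 2\arctan L$ and $\int_{-L}^{L} |x\psi(x)|^2\, dx = 2L - 2\arctan L$, which the sketch below simply evaluates.

```python
import math

def norm_sq_psi(half_width):
    """Integral of |psi|^2 over [-L, L] for psi(x) = (1 + x^2)^(-1/2)."""
    return 2.0 * math.atan(half_width)

def norm_sq_x_psi(half_width):
    """Integral of |x psi(x)|^2 over [-L, L], using x^2/(1+x^2) = 1 - 1/(1+x^2)."""
    return 2.0 * half_width - 2.0 * math.atan(half_width)

if __name__ == "__main__":
    for half_width in (1.0, 10.0, 100.0, 1000.0):
        print(half_width, norm_sq_psi(half_width), norm_sq_x_psi(half_width))
```

The first quantity stays below $\pi$ for every L, so $\psi \in L^2(\mathbb{R})$, while the second grows like $2L$, so $x\psi(x)$ is not square-integrable and $\psi$ lies outside the domain of the position operator.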


We now turn to consideration of the momentum operator. Since the
Fourier transform converts $\partial/\partial x_j$ into multiplication by $ik_j$ (Proposition
A.17) we can use the preceding results on multiplication operators to obtain
a natural domain on which the momentum operator is self-adjoint.


Proposition 9.32 For each $j = 1, 2, \ldots, n$, define a domain
$\mathrm{Dom}(P_j) \subset L^2(\mathbb{R}^n)$ as follows:

\[ \mathrm{Dom}(P_j) = \{ \psi \in L^2(\mathbb{R}^n) \mid k_j \hat{\psi}(k) \in L^2(\mathbb{R}^n) \}, \]
where $\hat{\psi}$ is the Fourier transform of $\psi$. Define $P_j$ on this domain by

\[ P_j\psi = \mathcal{F}^{-1}(\hbar k_j \hat{\psi}(k)). \]

Then $P_j$ is self-adjoint on $\mathrm{Dom}(P_j)$.

The domain $\mathrm{Dom}(P_j)$ of $P_j$ can also be described as the set of all
$\psi \in L^2(\mathbb{R}^n)$ such that $\partial\psi/\partial x_j$, computed in the distribution sense, belongs
to $L^2(\mathbb{R}^n)$. For any $\psi \in \mathrm{Dom}(P_j)$, we have $P_j\psi = -i\hbar\, \partial\psi/\partial x_j$, where
$\partial\psi/\partial x_j$ is computed in the distribution sense.
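
The Fourier-side description of the momentum operator has a familiar discrete analogue: multiply the discrete Fourier coefficients by $\hbar k$ and transform back. The sketch below is a rough numerical illustration of this idea for $n = 1$, not a statement about the operator $P_j$ itself. It assumes $\hbar = 1$, a periodic grid on $[-8, 8)$ with 64 points, a plain $O(N^2)$ discrete Fourier transform, and a Gaussian test function, for which $-i\, d\psi/dx$ is known in closed form; all names are hypothetical.

```python
import cmath
import math

def dft(values, sign):
    """Plain O(N^2) discrete Fourier transform with kernel exp(sign*2*pi*i*m*idx/N)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * m * idx / n)
            for idx, v in enumerate(values))
        for m in range(n)
    ]

def momentum(values, half_width, hbar=1.0):
    """Approximate P psi = F^{-1}(hbar * k * F psi) on a periodic grid.

    values are samples of psi at the N equally spaced points
    x_idx = -half_width + 2 * half_width * idx / N.
    """
    n = len(values)
    hat = dft(values, -1)
    scaled = []
    for m, coeff in enumerate(hat):
        signed = m if m < n // 2 else m - n  # signed frequency index
        # Discrete wave number; the Nyquist mode is dropped for odd derivatives.
        k = 0.0 if 2 * m == n else math.pi * signed / half_width
        scaled.append(hbar * k * coeff)
    return [c / n for c in dft(scaled, +1)]

N, L = 64, 8.0
xs = [-L + 2 * L * idx / N for idx in range(N)]
gauss = [math.exp(-x * x / 2) for x in xs]
p_gauss = momentum(gauss, L)
# Closed form: -i d/dx exp(-x^2/2) = i x exp(-x^2/2).
exact = [1j * x * math.exp(-x * x / 2) for x in xs]
max_err = max(abs(a - b) for a, b in zip(p_gauss, exact))

if __name__ == "__main__":
    print(max_err)
```

For this smooth, rapidly decaying test function the Fourier multiplier and the pointwise formula $-i\, d\psi/dx$ agree to high accuracy on the grid, which mirrors the statement that $P_j\psi = -i\hbar\, \partial\psi/\partial x_j$ on $\mathrm{Dom}(P_j)$.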







Saying that the distributional derivative of $\psi$ belongs to
$L^2(\mathbb{R}^n)$ means (Proposition A.29) that there exists a (unique)
$\phi$ in $L^2(\mathbb{R}^n)$ such that

$$-\left\langle \frac{\partial\chi}{\partial x_j}, \psi \right\rangle = \langle \chi, \phi \rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. If $\psi$ is continuously
differentiable, then the distributional derivative of $\psi$ coincides with
the ordinary derivative of $\psi$. Thus, if $\psi \in L^2(\mathbb{R}^n)$ is
continuously differentiable, then $\psi$ belongs to $\mathrm{Dom}(P_j)$ if
and only if $\partial\psi/\partial x_j$, computed in the pointwise sense,
belongs to $L^2(\mathbb{R}^n)$, in which case
$P_j\psi = -i\hbar\,\partial\psi/\partial x_j$. On the other hand, if
$\psi \in \mathrm{Dom}(P_j)$, it is not necessarily the case that $\psi$ is
continuously differentiable.

In the case $n = 1$, the domain of $P_1$ certainly contains
$C_c^\infty(\mathbb{R})$, since each element $\psi$ of
$C_c^\infty(\mathbb{R})$ is a Schwartz function (Definition A.15), so that
$\hat\psi$ is also a Schwartz function, in which case $k\hat\psi(k)$ belongs
to $L^2(\mathbb{R})$. Now, as shown in Sect. 9.7, the operator
$-i\hbar\,d/dx$ is essentially self-adjoint on $C_c^\infty(\mathbb{R})$,
which means that this operator has a unique self-adjoint extension. This
self-adjoint extension must, therefore, agree with the operator $P_1$ in
the $n = 1$ case of Proposition 9.32.


Lemma 9.33 Suppose $\psi \in L^2(\mathbb{R}^n)$ has the property that
$\partial\psi/\partial x_j$, computed in the distribution sense, is equal to
an $L^2$ function $\phi$. Then $\hat\phi(k) = ik_j\hat\psi(k)$, showing that
$k_j\hat\psi(k)$ belongs to $L^2(\mathbb{R}^n)$.

Conversely, suppose $\psi \in L^2(\mathbb{R}^n)$ has the property that
$k_j\hat\psi(k)$ belongs to $L^2(\mathbb{R}^n)$. Then
$\partial\psi/\partial x_j$, computed in the distribution sense, is equal to
the $L^2$ function $\mathcal{F}^{-1}(ik_j\mathcal{F}(\psi))$.


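Proof. Suppose $\partial\psi/\partial x_j$, computed in the distribution
sense, is equal to the $L^2$ function $\phi$ (see Definition A.28). Then by
the unitarity of the Fourier transform (Theorem A.19) and its behavior with
respect to differentiation (Proposition A.17), we have

$$\langle \chi, \phi \rangle = -\left\langle \frac{\partial\chi}{\partial x_j}, \psi \right\rangle = -\langle ik_j\mathcal{F}(\chi), \mathcal{F}(\psi) \rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. Thus,

$$\langle \mathcal{F}(\chi), \mathcal{F}(\phi) \rangle = -\langle ik_j\mathcal{F}(\chi), \mathcal{F}(\psi) \rangle, \qquad \chi \in C_c^\infty(\mathbb{R}^n).$$

Writing this equality out as an integral, we have

$$\int_{\mathbb{R}^n} \overline{\hat\chi(k)}\,\hat\phi(k)\,dk = -\int_{\mathbb{R}^n} \overline{ik_j\hat\chi(k)}\,\hat\psi(k)\,dk = \int_{\mathbb{R}^n} \overline{\hat\chi(k)}\,ik_j\hat\psi(k)\,dk \tag{9.17}$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.

We now claim that because (9.17) holds for all
$\chi \in C_c^\infty(\mathbb{R}^n)$, we must have
$\hat\phi(k) = ik_j\hat\psi(k)$ for almost every $k$. Using the
Stone–Weierstrass theorem and Theorem A.10, it is not hard to show that the
space of smooth functions with support in $[a, b]$ is dense in
$L^2([a, b])$, for all $a < b \in \mathbb{R}$. Since both $\hat\phi$ and
$k_j\hat\psi(k)$ are locally square-integrable, we see that these two
functions are equal almost everywhere on $[a, b]$, for all
$a < b \in \mathbb{R}$, and hence equal almost everywhere on $\mathbb{R}$.

Since $\hat\phi$ is globally square-integrable, so is $k_j\hat\psi(k)$.
Furthermore, by the injectivity of the $L^2$ Fourier transform, we have

$$\frac{\partial\psi}{\partial x_j} = \phi = \mathcal{F}^{-1}(ik_j\mathcal{F}(\psi)),$$

as claimed.

The argument for the second part of the lemma is similar and left as an
exercise (Exercise 12). ∎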
Proof of Proposition 9.32. By Proposition 9.30, the operator of
multiplication by $k_j$ is an unbounded self-adjoint operator on
$L^2(\mathbb{R}^n)$, with domain equal to the set of $\phi$ for which
$k_j\phi(k)$ belongs to $L^2(\mathbb{R}^n)$. It then follows from the
unitarity of the Fourier transform that
$P_j = \hbar\mathcal{F}^{-1}M_{k_j}\mathcal{F}$ is self-adjoint on
$\mathcal{F}^{-1}(\mathrm{Dom}(M_{k_j}))$, where $M_{k_j}$ denotes
multiplication by $k_j$. The second characterization of $\mathrm{Dom}(P_j)$
follows from Lemma 9.33. ∎
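
As a rough numerical illustration of the two descriptions of $P_j$ (not part of the proof), the following sketch compares "multiply the Fourier transform by $\hbar k$" with $-i\hbar\,d/dx$ for a Gaussian in one dimension. Everything here is an assumption of the sketch rather than something taken from the text: the discrete Fourier transform on a periodic grid stands in for $\mathcal{F}$, the units have $\hbar = 1$, and the names `dft`, `inverse_dft`, and `momentum_via_fourier` are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def dft(samples):
    """Plain O(N^2) discrete Fourier transform, a stand-in for F."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * math.pi * m * j / n) for j in range(n))
        for m in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse of dft above, a stand-in for F^{-1}."""
    n = len(coeffs)
    return [
        sum(coeffs[m] * cmath.exp(2j * math.pi * m * j / n) for m in range(n)) / n
        for j in range(n)
    ]


def momentum_via_fourier(psi_samples, box_length):
    """Apply F^{-1}(hbar * k * F(psi)) on a periodic grid (even grid size assumed)."""
    n = len(psi_samples)
    psi_hat = dft(psi_samples)
    multiplied = []
    for m in range(n):
        # Map the DFT index to a signed frequency; the Nyquist mode is dropped.
        if m < n // 2:
            signed = m
        elif m == n // 2:
            signed = 0
        else:
            signed = m - n
        k = 2.0 * math.pi * signed / box_length
        multiplied.append(HBAR * k * psi_hat[m])
    return inverse_dft(multiplied)


N = 64
BOX = 16.0
xs = [-BOX / 2 + BOX * j / N for j in range(N)]
psi = [math.exp(-x * x / 2) for x in xs]

p_psi = momentum_via_fourier(psi, BOX)
# Pointwise formula: -i*hbar*psi'(x) = i*hbar*x*exp(-x^2/2) for this Gaussian.
exact = [1j * HBAR * x * math.exp(-x * x / 2) for x in xs]
max_error = max(abs(u - v) for u, v in zip(p_psi, exact))
```

For a smooth, rapidly decaying function such as this one, the two computations should agree to high accuracy on the grid, which is the discrete shadow of the statement that $P_j\psi = -i\hbar\,\partial\psi/\partial x_j$ on $\mathrm{Dom}(P_j)$.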


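Proposition 9.34 Define a domain $\mathrm{Dom}(\Delta)$ as follows:

$$\mathrm{Dom}(\Delta) = \left\{\psi \in L^2(\mathbb{R}^n) \,\middle|\, |k|^2\hat\psi(k) \in L^2(\mathbb{R}^n)\right\}.$$

Define $\Delta$ on this domain by the expression

$$\Delta\psi = -\mathcal{F}^{-1}\bigl(|k|^2\hat\psi(k)\bigr), \tag{9.18}$$

where $\hat\psi$ is the Fourier transform of $\psi$ and $\mathcal{F}^{-1}$
is the inverse Fourier transform. Then $\Delta$ is self-adjoint on
$\mathrm{Dom}(\Delta)$.

The domain $\mathrm{Dom}(\Delta)$ may also be described as the set of all
$\psi \in L^2(\mathbb{R}^n)$ such that $\Delta\psi$, computed in the
distribution sense, belongs to $L^2(\mathbb{R}^n)$. If
$\psi \in \mathrm{Dom}(\Delta)$, then $\Delta\psi$ as defined by (9.18)
agrees with $\Delta\psi$ computed in the distribution sense.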


The proof of Proposition 9.34 is extremely similar to that of
Proposition 9.32 and is omitted. Of course, the kinetic energy operator
$-\hbar^2\Delta/(2m)$ is also self-adjoint on the same domain as $\Delta$.
It is easy to see from (9.18) and the unitarity of the Fourier transform
that $-\hbar^2\Delta/(2m)$ is non-negative, that is, that

$$\left\langle \psi, -\frac{\hbar^2}{2m}\Delta\psi \right\rangle \geq 0$$

for all $\psi \in \mathrm{Dom}(\Delta)$.

Using the same reasoning as in Sects. 9.6 and 9.7, it is not hard to show
that the operators $P_j$ and $\Delta$ are essentially self-adjoint on
$C_c^\infty(\mathbb{R}^n)$. See Exercise 16.

Care must be exercised in applying Proposition 9.34. Although the function

$$\psi(x) = \frac{1}{|x|}$$

is harmonic on $\mathbb{R}^3 \setminus \{0\}$, the Laplacian over
$\mathbb{R}^3$ of $\psi$ in the distribution sense is not zero
(Exercise 13). (It can be shown, by carefully analyzing the calculation in
the proof of Proposition 9.35, that $\Delta\psi$ is a nonzero multiple of a
$\delta$-function.) This example shows that if a function $\psi$ has a
singularity, calculating the Laplacian of $\psi$ away from the singularity
may not give the correct distributional Laplacian of $\psi$. For example,
the function $\phi$ in $L^2(\mathbb{R}^3)$ given by

$$\phi(x) = \frac{e^{-|x|^2}}{|x|} \tag{9.19}$$

is not in $\mathrm{Dom}(\Delta)$, even though both $\phi$ and $\Delta\phi$
are (by direct computation) square-integrable over
$\mathbb{R}^3 \setminus \{0\}$. Indeed, when $n \leq 3$, every element of
$\mathrm{Dom}(\Delta)$ is continuous (Exercise 14).
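
The claim that the distributional Laplacian of $1/|x|$ is nonzero can be checked numerically on one test function. The sketch below pairs $1/|x|$ with $\Delta\chi$ for the radial function $\chi(r) = e^{-r^2}$, which is an assumption of the sketch: it is a Schwartz function rather than a compactly supported one, chosen only because its derivatives are explicit. In polar coordinates the pairing reduces to $4\pi\int_0^\infty (r\chi'' + 2\chi')\,dr$; the helper names are hypothetical.

```python
import math


def chi_prime(r):
    """First radial derivative of chi(r) = exp(-r^2)."""
    return -2.0 * r * math.exp(-r * r)


def chi_second(r):
    """Second radial derivative of chi(r) = exp(-r^2)."""
    return (4.0 * r * r - 2.0) * math.exp(-r * r)


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def radial_integrand(r):
    # (1/r) * (chi'' + (2/r) * chi') * r^2  simplifies to  r*chi'' + 2*chi'.
    return r * chi_second(r) + 2.0 * chi_prime(r)


# Integral over R^3 of (1/|x|) * Laplacian(chi), truncated at r = 10.
pairing = 4.0 * math.pi * simpson(radial_integrand, 0.0, 10.0, 2000)
expected = -4.0 * math.pi  # equals -4*pi*chi(0), since chi(0) = 1
```

If the distributional Laplacian of $1/|x|$ were zero, `pairing` would vanish; instead it is close to $-4\pi\chi(0)$, consistent with $\Delta\psi$ being a nonzero multiple of a $\delta$-function.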


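Proposition 9.35 Suppose $\psi(x) = g(x)f(|x|)$, where $g$ is a smooth
function on $\mathbb{R}^n$ and $f$ is a smooth function on $(0, \infty)$.
Suppose also that $f$ satisfies

$$\lim_{r \to 0^+} r^{n-1}f(r) = 0, \qquad \lim_{r \to 0^+} r^{n-1}f'(r) = 0.$$

If both $\psi$ and $\Delta\psi$ are square-integrable over
$\mathbb{R}^n \setminus \{0\}$, then $\psi$ belongs to
$\mathrm{Dom}(\Delta)$.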


Note that the second condition in the proposition fails if $n = 3$ and
$f(r) = 1/r$. We will make use of this result in Chap. 18.

Proof. To apply Proposition 9.34, we need to compute
$\langle \psi, \Delta\chi \rangle$, for each
$\chi \in C_c^\infty(\mathbb{R}^n)$. We choose a large cube $C$, centered
at the origin and such that the support of $\chi$ is contained in the
interior of $C$. Then we consider the integral of
$\bar\psi\,(\partial^2\chi/\partial x_j^2)$ over
$C \setminus C_\varepsilon$, where $C_\varepsilon$ is a cube centered at
the origin and having side-length $\varepsilon$. We evaluate the
$x_j$-integral first and we integrate by parts twice. For "good" values of
the remaining variables, $x_j$ ranges over the full width of $C$, in which
case there are no boundary terms to worry about. For "bad" values of the
remaining variables, we get two kinds of boundary terms, one involving
$\bar\psi\,(\partial\chi/\partial x_j)$ and one involving
$(\partial\bar\psi/\partial x_j)\chi$, in both cases integrated over two
opposite faces of $C_\varepsilon$.

Now,

$$\frac{\partial\psi}{\partial x_j} = \frac{\partial g}{\partial x_j}f + g\,\frac{df}{dr}\,\frac{x_j}{r}.$$

Since the area of the faces of the cube is $\varepsilon^{n-1}$, the
assumption on $f$ will cause the boundary terms to disappear in the limit
as $\varepsilon$ tends to zero. Furthermore, both $\psi$ and $\Delta\psi$
are in $L^2(\mathbb{R}^n)$ and thus in $L^1(C)$, where in the case of
$\Delta\psi$, we simply leave the value at the origin (which is a set of
measure zero) undefined. Thus, integrals of $\bar\psi\,\Delta\chi$ and
$(\Delta\bar\psi)\chi$ over $C \setminus C_\varepsilon$ will converge to
integrals over $C$. Since the boundary terms vanish in the limit, we are
left with

$$\langle \psi, \Delta\chi \rangle = \langle \Delta\psi, \chi \rangle.$$

Thus, the distributional Laplacian of $\psi$ is simply integration against
the "pointwise" Laplacian, ignoring the origin. Proposition 9.34 then tells
us that $\psi \in \mathrm{Dom}(\Delta)$. ∎


9.9 Sums of Self-Adjoint Operators 


In the previous section, we have succeeded in defining the Laplacian
$\Delta$, and hence also the kinetic energy operator
$-\hbar^2\Delta/(2m)$, as a self-adjoint operator on a natural dense domain
in $L^2(\mathbb{R}^n)$. We have also defined the potential energy operator
$V(X)$ as a self-adjoint operator on a different dense domain, for any
measurable function $V : \mathbb{R}^n \to \mathbb{R}$. To obtain the
Schrödinger operator $-\hbar^2\Delta/(2m) + V(X)$, we "merely" have to make
sense of the sum of two unbounded self-adjoint operators. This task,
however, turns out to be more difficult than might be expected. In
particular, if $V$ is a highly singular function, then
$-\hbar^2\Delta/(2m) + V(X)$ may fail to be self-adjoint or essentially
self-adjoint on any natural domain.


Definition 9.36 If $A$ and $B$ are unbounded operators on $H$, then
$A + B$ is the operator with domain

$$\mathrm{Dom}(A + B) := \mathrm{Dom}(A) \cap \mathrm{Dom}(B)$$

and given by $(A + B)\psi = A\psi + B\psi$.


The sum of two unbounded self-adjoint operators $A$ and $B$ may fail to be
self-adjoint or even essentially self-adjoint. [If, however, $B$ is bounded
with $\mathrm{Dom}(B) = H$, then Proposition 9.13 shows that $A + B$ is
self-adjoint on $\mathrm{Dom}(A) \cap \mathrm{Dom}(B) = \mathrm{Dom}(A)$.]
For one thing, if $A$ and $B$ are unbounded, then
$\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$ may fail to be dense in $H$. But
even if $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$ is dense in $H$, it can
easily happen that $A + B$ is not essentially self-adjoint on this domain.
(See, for example, Sect. 9.10.) Many things that are simple for bounded
self-adjoint operators become complicated when dealing with unbounded
self-adjoint operators!

In this section, we examine criteria on a function $V$ under which the
Schrödinger operator

$$\hat{H} = -\frac{\hbar^2}{2m}\Delta + V$$

is self-adjoint or essentially self-adjoint on some natural domain inside
$L^2(\mathbb{R}^n)$.


Theorem 9.37 (Kato–Rellich Theorem) Suppose that $A$ and $B$ are
unbounded self-adjoint operators on $H$. Suppose that
$\mathrm{Dom}(A) \subset \mathrm{Dom}(B)$ and that there exist positive
constants $a$ and $b$ with $a < 1$ such that

$$\|B\psi\| \leq a\|A\psi\| + b\|\psi\| \tag{9.20}$$

for all $\psi \in \mathrm{Dom}(A)$. Then $A + B$ is self-adjoint on
$\mathrm{Dom}(A)$ and essentially self-adjoint on any subspace of
$\mathrm{Dom}(A)$ on which $A$ is essentially self-adjoint. Furthermore, if
$A$ is non-negative, then the spectrum of $A + B$ is bounded below by
$-b/(1 - a)$.


Note that since we assume $\mathrm{Dom}(B) \supset \mathrm{Dom}(A)$, the
natural domain for $A + B$ is
$\mathrm{Dom}(A) \cap \mathrm{Dom}(B) = \mathrm{Dom}(A)$. An operator $B$
satisfying (9.20) is said to be relatively bounded with respect to $A$,
with relative bound $a$.

Proof. We use the trivial variant of Theorem 9.21 given in Exercise 8.
Choose a positive real number $\mu$ large enough that $a + b/\mu < 1$,
which is possible because we assume $a < 1$. Then for any
$\psi \in \mathrm{Dom}(A)$, we have

$$(A + B + i\mu I)\psi = \bigl(B(A + i\mu I)^{-1} + I\bigr)(A + i\mu I)\psi. \tag{9.21}$$

For any $\psi \in H$, we compute that

$$\|B(A + i\mu I)^{-1}\psi\| \leq a\|A(A + i\mu I)^{-1}\psi\| + b\|(A + i\mu I)^{-1}\psi\| \leq \left(a + \frac{b}{\mu}\right)\|\psi\|. \tag{9.22}$$

Here we have made use of the estimates

$$\|A(A + i\mu I)^{-1}\| \leq 1, \qquad \|(A + i\mu I)^{-1}\| \leq \frac{1}{\mu},$$

both of which are elementary (Exercise 17).

If $C$ denotes the operator $B(A + i\mu I)^{-1}$, (9.22) tells us that
$\|C\| \leq (a + b/\mu) < 1$. Thus, by Lemma 7.6, $C + I$ is invertible.
Furthermore, since $A$ is self-adjoint, $A + i\mu I$ maps
$\mathrm{Dom}(A)$ onto $H$. Thus, (9.21) tells us that $A + B + i\mu I$
also maps $\mathrm{Dom}(A)$ onto $H$. The same argument shows that
$A + B - i\mu I$ maps $\mathrm{Dom}(A)$ onto $H$ and we conclude, by
Exercise 8, that $A + B$ is self-adjoint on $\mathrm{Dom}(A)$.

Suppose, in addition, that $A$ is non-negative. Let us replace $i\mu$ by
$\lambda > 0$ in (9.21). Calculating as in (9.22), using the estimates in
Exercise 18, we obtain that

$$\|B(A + \lambda I)^{-1}\psi\| \leq \left(a + \frac{b}{\lambda}\right)\|\psi\|$$

for all $\psi \in H$. If $\lambda > b/(1 - a)$, then $a + b/\lambda < 1$,
and by the above argument, $\mathrm{Range}(A + B + \lambda I) = H$.
Furthermore, since $A + B + \lambda I$ is self-adjoint, Proposition 9.12
tells us that $\ker(A + B + \lambda I) = \{0\}$. This shows that
$A + B + \lambda I$ is invertible and $-\lambda$ is in the resolvent set of
$A + B$. We conclude, then, that the spectrum of $A + B$ is contained in
$[-b/(1 - a), +\infty)$.

The last part of the theorem, concerning essential self-adjointness, is
left as an exercise (Exercise 19). ∎
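
To see the constants in (9.20) and the lower bound $-b/(1 - a)$ at work, here is a toy finite-dimensional check. It is only a sketch under stated assumptions: in finite dimensions every operator is bounded, so the theorem itself is trivial there, and the matrices $A = \mathrm{diag}(0, 10)$ and $B = -aA + B_0$ (with $\|B_0\| = b$) are invented for illustration. The helper names are hypothetical.

```python
import math


def sym2_eigenvalues(p, q, r):
    """Eigenvalues of the real symmetric matrix [[p, q], [q, r]]."""
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    return mean - radius, mean + radius


def apply(matrix, v):
    """Multiply a 2x2 matrix (tuple of rows) by a vector."""
    return (
        matrix[0][0] * v[0] + matrix[0][1] * v[1],
        matrix[1][0] * v[0] + matrix[1][1] * v[1],
    )


def norm(v):
    return math.hypot(v[0], v[1])


a, b = 0.5, 1.0                      # constants in (9.20), with a < 1
A = ((0.0, 0.0), (0.0, 10.0))        # non-negative "unperturbed" operator
B = ((0.0, b), (b, -a * 10.0))       # B = -a*A + B0, where ||B0|| = b

# Check (9.20) on a grid of unit vectors (it holds by the triangle inequality).
unit_vectors = [
    (math.cos(2.0 * math.pi * k / 360), math.sin(2.0 * math.pi * k / 360))
    for k in range(360)
]
relative_bound_holds = all(
    norm(apply(B, v)) <= a * norm(apply(A, v)) + b * norm(v) + 1e-12
    for v in unit_vectors
)

lowest, highest = sym2_eigenvalues(
    A[0][0] + B[0][0], A[0][1] + B[0][1], A[1][1] + B[1][1]
)
spectral_floor = -b / (1.0 - a)      # lower bound from Theorem 9.37
```

Here `lowest` is the smaller eigenvalue of $A + B$, and it lies above `spectral_floor`, as the last statement of the theorem predicts.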


Theorem 9.38 Suppose $n \leq 3$ and $V : \mathbb{R}^n \to \mathbb{R}$ is a
measurable function that can be decomposed as a sum of two real-valued,
measurable functions $V_1$ and $V_2$, with $V_1$ belonging to
$L^2(\mathbb{R}^n)$ and $V_2$ being bounded. Then the Schrödinger operator
$-\hbar^2\Delta/(2m) + V(X)$ is self-adjoint on $\mathrm{Dom}(\Delta)$.
Furthermore, $-\hbar^2\Delta/(2m) + V(X)$ is bounded below.


Implicit in the statement of the theorem is that $\mathrm{Dom}(V(X))$, as
given in Proposition 9.30, contains $\mathrm{Dom}(\Delta)$. A result
similar to Theorem 9.38 holds in $\mathbb{R}^n$, $n \geq 4$, but the
condition that $V_1$ belongs to $L^2(\mathbb{R}^n)$ is replaced by the
condition that $V_1$ belongs to $L^p(\mathbb{R}^n)$ for some $p > n/2$. See
Theorem X.20 in Volume II of [34].

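Proof. We apply the Kato–Rellich theorem with $A = -\hbar^2\Delta/(2m)$
and $B = V(X)$. Assume $\psi \in \mathrm{Dom}(\Delta)$ and fix some
$\varepsilon > 0$. By Exercise 14, there exists a constant $c_\varepsilon$
such that

$$|\psi(x)| \leq \varepsilon\|\Delta\psi\| + c_\varepsilon\|\psi\|$$

for all $x \in \mathbb{R}^n$. Thus, if $V$ is as in the theorem and
$\psi \in \mathrm{Dom}(\Delta)$,

$$\|V\psi\| \leq \sup|\psi(x)|\,\|V_1\| + \sup|V_2(x)|\,\|\psi\| \leq \varepsilon\|V_1\|\,\|\Delta\psi\| + \bigl(c_\varepsilon\|V_1\| + \sup|V_2(x)|\bigr)\|\psi\|.$$

This shows that $\mathrm{Dom}(V(X)) \supset \mathrm{Dom}(\Delta)$. Since
$\varepsilon$ is arbitrary, we can arrange for the constant in front of
$\|\Delta\psi\|$ to be less than one and the Kato–Rellich theorem
applies. ∎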


Theorem 9.39 Suppose $n \leq 3$ and $V : \mathbb{R}^n \to \mathbb{R}$ is a
measurable function that can be decomposed as a sum of three real-valued,
measurable functions $V_1$, $V_2$, and $V_3$, with $V_1$ belonging to
$L^2(\mathbb{R}^n)$, $V_2$ being bounded, and $V_3$ being non-negative and
locally square-integrable. Then the Schrödinger operator
$-\hbar^2\Delta/(2m) + V(X)$ is essentially self-adjoint on
$C_c^\infty(\mathbb{R}^n)$.

The proof of this result would take us too far afield and is omitted. See
Theorem X.29 in Volume II of [34]. Note that we assume only that $V_3$ is
non-negative and locally square-integrable; $V_3$ can tend to $+\infty$
arbitrarily fast at infinity. Again, the same result applies in
$\mathbb{R}^n$, $n \geq 4$, if the condition on $V_1$ is replaced by the
assumption that $V_1 \in L^p(\mathbb{R}^n)$ for some $p > n/2$.


Proposition 9.40 Fix $a$ and $b$ in $\mathbb{R}^n$ and let
$a \cdot X + b \cdot P$ denote the operator given by

$$(a \cdot X + b \cdot P)\psi(x) = (a \cdot x)\psi(x) - i\hbar\sum_{j=1}^{n} b_j\frac{\partial\psi}{\partial x_j}.$$

Then $a \cdot X + b \cdot P$ is essentially self-adjoint on
$C_c^\infty(\mathbb{R}^n)$.


Proof. We use the same strategy as in Sect. 9.7, namely we explicitly
solve the equation $A^*\psi = \pm i\psi$ and find that there are no
nonzero, square-integrable solutions.

The case $b = 0$ is not hard to analyze and is left as an exercise
(Exercise 20). Assume, then, that $b \neq 0$. By making a rotational change
of variables, we can assume that $b = \alpha e_1$ and
$a = \beta e_1 + \gamma e_2$, so that

$$(A\psi)(x) = (\beta x_1 + \gamma x_2)\psi(x) - i\hbar\alpha\frac{\partial\psi}{\partial x_1}. \tag{9.23}$$

(If $n = 1$, the $\gamma x_2$ term is not present.) As in the proof of
Proposition 9.29, the adjoint $A^*$ of $A$ will be given by the same
formula as $A$, with $\mathrm{Dom}(A^*)$ consisting of those elements
$\psi$ of $L^2(\mathbb{R}^n)$ for which the right-hand side of (9.23),
computed in the distributional sense, belongs to $L^2(\mathbb{R}^n)$.

We now apply the criterion for essential self-adjointness in
Corollary 9.22. We need to show that the equations $A^*\psi = i\psi$ and
$A^*\psi = -i\psi$ have no nonzero solutions in $\mathrm{Dom}(A^*)$. After
rewriting the equation $A^*\psi = i\psi$ as

$$\frac{\partial\psi}{\partial x_1} = \frac{1}{i\hbar\alpha}(\beta x_1 + \gamma x_2)\psi(x) - \frac{1}{\hbar\alpha}\psi(x), \tag{9.24}$$

we can easily find the general distributional solution as

$$\psi(x) = c(x_2, \ldots, x_n)\exp\left\{-\frac{i\beta}{2\hbar\alpha}x_1^2 - \frac{i\gamma}{\hbar\alpha}x_1x_2 - \frac{1}{\hbar\alpha}x_1\right\}. \tag{9.25}$$

[It is easily verified that if we let $\phi$ equal $\psi$ divided by the
exponential on the right-hand side of (9.25), then $\phi$ satisfies
$\partial\phi/\partial x_1 = 0$ in the distributional sense. Exercise 21
then tells us that $\phi$ must be a function of $x_2, \ldots, x_n$.] Since
the exponential factor is never square integrable as a function of $x_1$
with $x_2$ fixed, the only way that $\psi$ can be square integrable is if
$c$ is zero for almost every value of $(x_2, \ldots, x_n)$, in which case
$\psi$ is the zero element of $L^2(\mathbb{R}^n)$. A similar argument shows
that the equation $A^*\psi = -i\psi$ has no nonzero solutions. ∎


9.10 Another Counterexample 


In this section, we will show that the Schrödinger operator
$H = P^2/(2m) - X^4$ is not essentially self-adjoint on
$C_c^\infty(\mathbb{R})$, even though $H$ is certainly symmetric. By
contrast, $P^2/(2m) + X^4$ is essentially self-adjoint, by Theorem 9.39.
The operator $P^2/(2m) - X^4$ is a more serious counterexample than the one
in Sect. 12.2, in that it does not involve any obviously incorrect choice
of boundary conditions. On the other hand, it should not be surprising that
something goes "wrong" in a quantum system with a potential equal to
$-x^4$. After all, a classical system with this potential has trajectories
that go to infinity in finite time (see Exercise 4 in Chap. 2).
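
The finite escape time can be made concrete. For the classical Hamiltonian $p^2/(2m) - x^4$ with energy $E \geq 0$, a particle moving to the right has $\dot{x} = \sqrt{2(E + x^4)/m}$, so the time to travel from $x_0 > 0$ to infinity is $\int_{x_0}^{\infty} dx/\dot{x}$. Substituting $u = 1/x$ turns this into the proper integral $\int_0^{1/x_0} du/\sqrt{2(1 + Eu^4)/m}$. The sketch below evaluates it; the function names and the restriction to $E \geq 0$, $x_0 > 0$ are assumptions of the sketch.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def escape_time(x0, energy, mass=1.0, steps=1000):
    """Time for a right-moving particle in V(x) = -x^4 to go from x0 to infinity.

    Assumes x0 > 0 and energy >= 0, so the velocity never vanishes.
    """
    def integrand(u):
        # After u = 1/x, dx / velocity becomes du / sqrt(2*(1 + E*u^4)/m).
        return 1.0 / math.sqrt(2.0 * (1.0 + energy * u ** 4) / mass)

    return simpson(integrand, 0.0, 1.0 / x0, steps)


t_zero_energy = escape_time(1.0, 0.0)   # closed form: 1/sqrt(2)
t_unit_energy = escape_time(1.0, 1.0)   # finite, and shorter than at E = 0
```

For $E = 0$ and $m = 1$ the integral is elementary and equals $1/\sqrt{2}$ when $x_0 = 1$, which the numerical value reproduces.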

To show that $H$ is not essentially self-adjoint, we will show that the
adjoint $H^*$ is not symmetric. Suppose $\psi$ is a $C^\infty$ function
such that both $\psi$ and the function

$$-\frac{\hbar^2}{2m}\psi''(x) - x^4\psi(x) \tag{9.26}$$

belong to $L^2(\mathbb{R})$. Using integration by parts, as in the proof of
Lemma 9.28, we can see that $\psi$ is in the domain of $H^*$ and $H^*\psi$
is the function in (9.26). We will construct an approximate eigenvector
$\psi \in \mathrm{Dom}(H^*)$ for $H^*$ with an imaginary eigenvalue
$i\alpha$, which will show that $H^*$ is not symmetric and thus $H$ is not
essentially self-adjoint.


Theorem 9.41 Define an operator $H$ with
$\mathrm{Dom}(H) = C_c^\infty(\mathbb{R})$ by the formula

$$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} - x^4.$$

Then $H$ is not essentially self-adjoint.


In preparation for the proof, let us define a function $p(x)$ on
$\mathbb{R}$ such that

$$\frac{p(x)^2}{2m} - x^4 = i\alpha,$$

that is,

$$p(x) = \sqrt{2m}\sqrt{x^4 + i\alpha}. \tag{9.27}$$

Here we take the square root that is in the first quadrant. The function
$p(x)$ represents "the momentum of a classical particle with energy
$i\alpha$."


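Lemma 9.42 If $\psi_\alpha$ is given by

$$\psi_\alpha(x) = \frac{1}{\sqrt{p(x)}}\exp\left\{\frac{i}{\hbar}\int_0^x p(y)\,dy\right\}, \tag{9.28}$$

then $\psi_\alpha$ belongs to $L^2(\mathbb{R})$ and the function

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_\alpha}{dx^2} - x^4\psi_\alpha \tag{9.29}$$

also belongs to $L^2(\mathbb{R})$. Furthermore, we have

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_\alpha}{dx^2} - x^4\psi_\alpha(x) - i\alpha\psi_\alpha(x) = -\frac{\hbar^2}{2m}\psi_\alpha(x)m_\alpha(x),$$

where

$$m_\alpha(x) = \frac{5x^6}{(x^4 + i\alpha)^2} - \frac{3x^2}{x^4 + i\alpha}.$$

It will be apparent from the proof that the two terms in (9.29) are not
separately in $L^2(\mathbb{R})$. The motivation for the definition of
$\psi_\alpha$ comes from the WKB approximation (Chap. 15) with a complex
value for the energy.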
Proof. Let us consider the integral of $p$,

$$\int_0^x p(y)\,dy = \sqrt{2m}\int_0^x \sqrt{y^4 + i\alpha}\,dy.$$

Using the power series for $(1 + x)^{1/2}$, we see that for large $y$,

$$\sqrt{y^4 + i\alpha} = y^2\sqrt{1 + i\alpha/y^4} = y^2\left(1 + \frac{i\alpha}{2y^4} + O\!\left(\frac{1}{y^8}\right)\right).$$

From this estimate, it is easy to see that the imaginary part of
$\int_0^x p(y)\,dy$ remains bounded as $x$ tends to $\pm\infty$. It follows
that the exponential in the definition of $\psi_\alpha$ is bounded, from
which it is easy to see that $\psi_\alpha$ is square integrable.
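
The boundedness of the imaginary part can be checked numerically. The sketch below integrates $\operatorname{Im} p(y)$ with $m = 1$ and $\alpha = 1$ (both assumptions of the sketch) and compares the increment between $x = 100$ and $x = 200$ with the prediction $\sqrt{2m}\,\alpha/(2y^2)$ from the expansion above. Python's `cmath.sqrt` returns the root with non-negative real part, which lies in the first quadrant when the imaginary part of the argument is positive; the function names are hypothetical.

```python
import cmath
import math


def p(x, alpha, mass=1.0):
    """Complex 'momentum' sqrt(2m) * sqrt(x^4 + i*alpha), first-quadrant root."""
    return math.sqrt(2.0 * mass) * cmath.sqrt(x ** 4 + 1j * alpha)


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def imaginary_phase(x, alpha, steps_per_unit=200):
    """Imaginary part of the integral of p from 0 to x, for x > 0."""
    n = 2 * max(1, int(round(x * steps_per_unit / 2.0)))  # keep n even
    return simpson(lambda y: p(y, alpha).imag, 0.0, x, n)


ALPHA = 1.0
phase_100 = imaginary_phase(100.0, ALPHA)
phase_200 = imaginary_phase(200.0, ALPHA)

# Leading-order prediction for the increment: integral of sqrt(2)*alpha/(2*y^2)
# from 100 to 200, which is sqrt(2)*alpha/400.
predicted_increment = math.sqrt(2.0) * ALPHA / 400.0
```

Doubling $x$ from 100 to 200 changes the imaginary part only by about $\sqrt{2}/400$, and the tail beyond any $X$ is of order $1/X$, so the modulus $\exp\{-\operatorname{Im}\int_0^x p/\hbar\}$ of the exponential stays bounded.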

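Now, using the formula for the second derivative of a product, we obtain

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_\alpha}{dx^2} = \left[\frac{p(x)^2}{2m\sqrt{p(x)}} - \frac{i\hbar}{m}\,p(x)\frac{d}{dx}\!\left(\frac{1}{\sqrt{p(x)}}\right) - \frac{i\hbar}{2m}\frac{p'(x)}{\sqrt{p(x)}}\right]\exp\left\{\frac{i}{\hbar}\int_0^x p(y)\,dy\right\} - \frac{\hbar^2}{2m}\left(\frac{d^2}{dx^2}\frac{1}{\sqrt{p(x)}}\right)\exp\left\{\frac{i}{\hbar}\int_0^x p(y)\,dy\right\}. \tag{9.30}$$

The factor of $1/\sqrt{p(x)}$ in the definition of $\psi_\alpha$ was chosen
precisely so that the second and third terms in square brackets will
cancel. If we replace $p(x)^2$ in the numerator of the first term by
$2m(x^4 + i\alpha)$, we obtain

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_\alpha}{dx^2} - x^4\psi_\alpha - i\alpha\psi_\alpha = -\frac{\hbar^2}{2m}\left(\frac{d^2}{dx^2}p(x)^{-1/2}\right)\exp\left\{\frac{i}{\hbar}\int_0^x p(y)\,dy\right\}.$$

It is then an elementary calculation to show that

$$\frac{d^2}{dx^2}p(x)^{-1/2} = p(x)^{-1/2}\left[5x^6(x^4 + i\alpha)^{-2} - 3x^2(x^4 + i\alpha)^{-1}\right],$$

from which the lemma follows. ∎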
Proof of Theorem 9.41. If $H$ were essentially self-adjoint, $H^*$ (which
would coincide with $H^{cl}$) would be self-adjoint and, in particular,
symmetric. If this were the case, we would have, by the proof of
Lemma 7.8,

$$\langle (H^* - i\alpha I)\psi, (H^* - i\alpha I)\psi \rangle \geq \alpha^2\langle \psi, \psi \rangle \tag{9.31}$$

for all $\psi \in \mathrm{Dom}(H^*)$ and $\alpha \in \mathbb{R}$. But if
$\psi_\alpha$ is the function in Lemma 9.42, the discussion preceding
Theorem 9.41 shows that $\psi_\alpha$ belongs to $\mathrm{Dom}(H^*)$.
Furthermore, it is easily verified that there is a constant $C$ such that
$|m_\alpha(x)| \leq C$ for all $\alpha \geq 1$ and $x \in \mathbb{R}$.
Thus, for all sufficiently large $\alpha$, we have

$$\|(H^* - i\alpha I)\psi_\alpha\|^2 \leq \frac{\hbar^4}{4m^2}C^2\|\psi_\alpha\|^2 < \alpha^2\|\psi_\alpha\|^2,$$

contradicting (9.31). ∎

See Exercise 22 for a more explicit approach to showing that $H^*$ is not
symmetric.


9.11 Exercises 


1. Show that an unbounded operator $A$ fails to be closable if and only
if the closure of the graph of $A$ contains an element of the form
$(0, \psi)$ with $\psi \neq 0$.


2. Define an unbounded operator $A$ on $L^2([0, 1])$ with domain
$\mathrm{Dom}(A) = C([0, 1])$ by

$$Af = f(0)\mathbf{1},$$

where $\mathbf{1}$ is the constant function. Show that $A$ is not closable.

3. Prove Proposition 9.13.


4. Suppose that $A$ is an unbounded self-adjoint operator on $H$ and that
numbers $\lambda_n$ in $\sigma(A)$ converge to some
$\lambda \in \mathbb{R}$. Using Point 1 of Proposition 9.18, show that
$\lambda \in \sigma(A)$.


5. Suppose $A$ is a closed operator on $H$. Show that the kernel of $A$ is
a closed subspace of $H$.


6. Suppose $A$ is a closed operator on $H$. Define a norm $\|\cdot\|_A$ on
$\mathrm{Dom}(A)$ by

$$\|\psi\|_A = \|\psi\| + \|A\psi\|.$$

Show that $\mathrm{Dom}(A)$ is a Banach space with respect to
$\|\cdot\|_A$.

7. Let $A$ be an unbounded operator on $H$.

(a) Show that if $A$ is symmetric, then $A^{cl}$ is also symmetric.

(b) Show that if $B$ is an extension of $A$, then $A^*$ is an extension of
$B^*$.

(c) Suppose $A$ is self-adjoint and $B$ is an extension of $A$. Show that
if $B$ is symmetric, then $\mathrm{Dom}(A) = \mathrm{Dom}(B)$. (That is to
say, a self-adjoint operator has no proper symmetric extensions.)


8. Fix a positive real number $\mu$.

(a) Show that a symmetric operator $A$ is self-adjoint if and only if
$\mathrm{Range}(A + i\mu I)$ and $\mathrm{Range}(A - i\mu I)$ are equal to
$H$.

(b) Show that a symmetric operator $A$ is essentially self-adjoint if and
only if $\mathrm{Range}(A + i\mu I)$ and $\mathrm{Range}(A - i\mu I)$ are
dense in $H$.


9. Let $A$ be the operator considered in Sect. 9.6. Using Lemma 9.28, show
that for each $\lambda \in \mathbb{C}$, there exists
$\psi \in \mathrm{Dom}(A^*)$ with $A^*\psi = \lambda\psi$. Conclude that
each $\lambda \in \mathbb{C}$ belongs to the spectrum of $A^{cl}$.

Hint: Recall that $(A^{cl})^* = A^*$.


10. Let $A$ be the operator considered in Sect. 9.6 and suppose $\psi$ is
in the domain of $A^{cl}$. Then there exists a sequence $\psi_n$ in
$\mathrm{Dom}(A)$ such that $\psi_n$ converges to $\psi$ in $L^2([0, 1])$
and such that $A\psi_n$ converges to some $\chi$ in $L^2([0, 1])$.

(a) Show that

$$\psi_n(x) = \left\langle 1_{[0,x]}, \frac{d\psi_n}{dx} \right\rangle = \frac{i}{\hbar}\left\langle 1_{[0,x]}, A\psi_n \right\rangle$$

for all $x \in [0, 1]$.

(b) Show that $\psi_n$ converges uniformly to the function

$$\tilde\psi(x) = \frac{i}{\hbar}\left\langle 1_{[0,x]}, \chi \right\rangle.$$

(c) Conclude that $\psi$ is continuous and satisfies
$\psi(0) = \psi(1) = 0$.


11. Take $H = L^2((0, \infty))$ and let $A$ be the operator $-i\,d/dx$,
with $\mathrm{Dom}(A)$ consisting of those smooth functions that are
supported on a compact subset of $(0, \infty)$. (Such a function is, in
particular, zero on $(0, \varepsilon)$ for some $\varepsilon > 0$.) Show
that $A$ is symmetric and that $A^* + iI$ is injective but that
$A^* - iI$ is not injective.

Hint: Imitate the arguments in the proof of Propositions 9.27 and 9.29.


12. Prove the second part of Lemma 9.33.


13. Let $\chi$ be a smooth, radial function on $\mathbb{R}^3$ such that for
$|x| < 1$ we have $\chi(x) = 1$, for $|x| > 2$ we have $\chi(x) = 0$, and
for $1 < |x| < 2$, we have $\partial\chi/\partial r < 0$. Show that

$$\int_{\mathbb{R}^3} \frac{1}{|x|}\Delta\chi(x)\,dx < 0,$$

which shows that the Laplacian of $1/|x|$, in the distribution sense, is
not zero.

Hint: Let $E = C_1 \setminus C_2$, where $C_1$ is a cube centered at the
origin with side length 3 and where $C_2$ is a cube centered at the origin
with side length 1/2. Then $E$ contains the support of $\Delta\chi$. Using
integration by parts on $E$, show that

$$\int_E \frac{1}{|x|}\Delta\chi(x)\,dx = -\int_E \nabla\!\left(\frac{1}{|x|}\right) \cdot \nabla\chi(x)\,dx.$$


14. Let $\mathrm{Dom}(\Delta) \subset L^2(\mathbb{R}^n)$ denote the domain
of the Laplacian, as given in Proposition 9.34, and assume $n \leq 3$.

(a) Show that each $\psi \in \mathrm{Dom}(\Delta)$ is continuous and that
there exist constants $c_1$ and $c_2$ such that

$$|\psi(x)| \leq c_1\|\psi\| + c_2\bigl\||k|^2\hat\psi\bigr\|$$

for all $\psi \in \mathrm{Dom}(\Delta)$.

Hint: Show that $\hat\psi$ is in $L^1$ by expressing $\hat\psi$ as the
product of two $L^2$ functions.

(b) Show that for any $\varepsilon > 0$, there exists a constant
$c_\varepsilon$ such that

$$|\psi(x)| \leq c_\varepsilon\|\psi\| + \varepsilon\|\Delta\psi\|$$

for all $\psi \in \mathrm{Dom}(\Delta)$.


15. Recall the definitions of $\mathrm{Dom}(P_j)$ and
$\mathrm{Dom}(\Delta)$ in Sect. 9.8. Let $\mathrm{Dom}(P_j^2)$ be the set
of all $\psi$ belonging to $\mathrm{Dom}(P_j)$ such that $P_j\psi$ again
belongs to $\mathrm{Dom}(P_j)$. Show that

$$\bigcap_{j=1}^{n} \mathrm{Dom}(P_j^2) = \mathrm{Dom}(\Delta).$$


16. Let $Q_j$ denote the restriction to $C_c^\infty(\mathbb{R}^n)$ of the
momentum operator $P_j$. Show that
$\mathrm{Dom}(Q_j^*) = \mathrm{Dom}(P_j)$. Conclude that $Q_j$ is
essentially self-adjoint.


17. Let $A$ be an unbounded self-adjoint operator on $H$ and let $\mu$ be a
nonzero real number.

(a) Show that $\|(A + i\mu I)^{-1}\| \leq 1/|\mu|$. Note that
$(A + i\mu I)^{-1}$ exists, by Theorem 9.17.

(b) Show that for all $\psi \in H$,

$$\|\psi\|^2 = \|A(A + i\mu I)^{-1}\psi\|^2 + \mu^2\|(A + i\mu I)^{-1}\psi\|^2.$$

Conclude that $\|A(A + i\mu I)^{-1}\| \leq 1$.


18. Let $A$ be an unbounded self-adjoint operator on $H$. Suppose $A$ is
non-negative (Definition 9.19) and let $\lambda$ be a positive real number.

(a) Show that $\|(A + \lambda I)^{-1}\| \leq 1/\lambda$.

(b) Show that for all $\psi \in H$,

$$\|\psi\|^2 \geq \|A(A + \lambda I)^{-1}\psi\|^2 + \lambda^2\|(A + \lambda I)^{-1}\psi\|^2.$$

Conclude that $\|A(A + \lambda I)^{-1}\| \leq 1$.


19. Prove the last part of Theorem 9.37, concerning domains of essential
self-adjointness.

Hint: If $A$ is self-adjoint on $\mathrm{Dom}(A)$ and
$V \subset \mathrm{Dom}(A)$ is a dense subspace of $H$, then $A$ is
essentially self-adjoint on $V$ if and only if the closure of $A|_V$ is
equal to $A$.


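20. Let $A$ be the operator $b \cdot X$ on the domain
$C_c^\infty(\mathbb{R}^n)$, for some $b \in \mathbb{R}^n$.

(a) Using the definition of the adjoint of an unbounded operator, show
that $\mathrm{Dom}(A^*)$ consists of all those $\psi$ in
$L^2(\mathbb{R}^n)$ for which the function $(b \cdot x)\psi(x)$ again
belongs to $L^2(\mathbb{R}^n)$.

(b) Using Proposition 9.30, show that $A$ is essentially self-adjoint.


21. (a) Show that a function $\phi \in C_c^\infty(\mathbb{R}^n)$ can be
expressed as $\phi = \partial\chi/\partial x_1$ for some
$\chi \in C_c^\infty(\mathbb{R}^n)$ if and only if $\phi$ satisfies

$$\int_{-\infty}^{\infty} \phi(x_1, x_2, \ldots, x_n)\,dx_1 = 0$$

for all $(x_2, \ldots, x_n)$.

(b) Fix a function $\gamma \in C_c^\infty(\mathbb{R})$ such that
$\int_{-\infty}^{\infty} \gamma(x)\,dx = 1$. Show that any
$\phi \in C_c^\infty(\mathbb{R}^n)$ can be expressed as

$$\phi(x) = f(x_2, \ldots, x_n)\gamma(x_1) + \frac{\partial\chi}{\partial x_1},$$

for some $\chi \in C_c^\infty(\mathbb{R}^n)$, where $f$ is the element of
$C_c^\infty(\mathbb{R}^{n-1})$ given by

$$f(x_2, \ldots, x_n) = \int_{-\infty}^{\infty} \phi(x_1, x_2, \ldots, x_n)\,dx_1.$$

(c) Suppose $T$ is a distribution on $\mathbb{R}^n$ with the property that

$$\frac{\partial T}{\partial x_1} = 0.$$

Define a distribution $c$ on $\mathbb{R}^{n-1}$ by the formula

$$c(f) = T\bigl(f(x_2, \ldots, x_n)\gamma(x_1)\bigr).$$

Show that for all $\phi \in C_c^\infty(\mathbb{R}^n)$ we have

$$T(\phi) = c(\tilde\phi),$$

where $\tilde\phi \in C_c^\infty(\mathbb{R}^{n-1})$ is given by

$$\tilde\phi(x_2, \ldots, x_n) = \int_{-\infty}^{\infty} \phi(x_1, x_2, \ldots, x_n)\,dx_1.$$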

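22. Let $H$ denote the Schrödinger operator in Theorem 9.41 and let
$\psi_\alpha$ be the function defined in Lemma 9.42.

(a) Show that

$$\langle \psi_\alpha, H^*\psi_\alpha \rangle - \langle H^*\psi_\alpha, \psi_\alpha \rangle = -\frac{\hbar^2}{2m}\lim_{A \to \infty}\left[\overline{\psi_\alpha(x)}\,\psi_\alpha'(x) - \overline{\psi_\alpha'(x)}\,\psi_\alpha(x)\right]_{x=-A}^{x=A}.$$

(b) Now show by direct calculation that
$\langle \psi_\alpha, H^*\psi_\alpha \rangle \neq \langle H^*\psi_\alpha, \psi_\alpha \rangle$.


The Spectral Theorem for Unbounded
Self-Adjoint Operators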


This chapter gives statements and proofs of the spectral theorem for
unbounded self-adjoint operators, in the same forms as in the bounded
case: in terms of projection-valued measures, in terms of direct integrals,
and in terms of multiplication operators. The proof reduces the spectral
theorem for an unbounded self-adjoint operator $A$ to the spectral theorem
for the bounded operator $U := (A + iI)(A - iI)^{-1}$ (Sect. 10.4). This
bounded operator is, however, not self-adjoint but rather unitary. Thus,
before coming to the proof of the spectral theorem for unbounded
self-adjoint operators, we prove (Sect. 10.3) the spectral theorem for
bounded normal operators, those that commute with their adjoints. (A
unitary operator $U$ certainly commutes with its adjoint $U^* = U^{-1}$.)
The proof for a bounded normal operator $B$ is the same as for bounded
self-adjoint operators, except for the step in which we approximate
continuous functions on $\sigma(B)$ by polynomials. Since $\sigma(B)$ is
not necessarily contained in $\mathbb{R}$, we need to use the complex
version of the Stone–Weierstrass theorem, which requires us to consider
polynomials in $\lambda$ and $\bar\lambda$. We must then prove a
strengthened version of the spectral mapping theorem before proceeding
along the lines of the proof for bounded self-adjoint operators.

In Sect. 10.2, we discuss Stone's theorem, which gives a one-to-one
correspondence between strongly continuous one-parameter unitary groups and
self-adjoint operators. One direction of Stone's theorem follows from the
spectral theorem, that is, from the functional calculus that results from
the spectral theorem.


10.1 Statements of the Spectral Theorem


The statement of the spectral theorem, in any of the forms that we have
considered, is almost the same for unbounded self-adjoint operators as for
bounded ones. The only difference is that the statement of the theorem in
the unbounded case has to contain some description of the domain of the
operator.

Recall that if $\mu$ is a projection-valued measure on $(X, \Omega)$ with
values in $B(H)$ and $\psi$ is an element of $H$, then we can construct a
non-negative, real-valued measure $\mu_\psi$ from $\mu$ by setting
$\mu_\psi(E) = \langle \psi, \mu(E)\psi \rangle$, for each measurable set
$E$. To motivate the following definition, consider integration of a
bounded measurable function $f$ against a projection-valued measure $\mu$.
Since the integral is multiplicative and complex-conjugation of a function
corresponds to adjoint of the operator, we have

$$\left\langle \left(\int_X f\,d\mu\right)\psi, \left(\int_X f\,d\mu\right)\psi \right\rangle = \left\langle \psi, \left(\int_X \bar{f}f\,d\mu\right)\psi \right\rangle = \int_X |f|^2\,d\mu_\psi. \tag{10.1}$$

Suppose, now, that $f$ is an unbounded measurable function on $X$ and we
wish to define $\int_X f\,d\mu$, which will presumably be an unbounded
operator. It seems reasonable to define the domain of $\int_X f\,d\mu$ to
be the set of $\psi$ for which the right-hand side of (10.1) is finite.


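Proposition 10.1 Suppose $\mu$ is a projection-valued measure on
$(X, \Omega)$ with values in $B(H)$ and $f : X \to \mathbb{C}$ is a
measurable function (not necessarily bounded). Define a subspace $W_f$ of
$H$ by

$$W_f = \left\{\psi \in H \,\middle|\, \int_X |f(\lambda)|^2\,d\mu_\psi(\lambda) < \infty\right\}. \tag{10.2}$$

Then there exists a unique unbounded operator on $H$ with domain $W_f$,
which is denoted by $\int_X f\,d\mu$, with the property that

$$\left\langle \psi, \left(\int_X f\,d\mu\right)\psi \right\rangle = \int_X f(\lambda)\,d\mu_\psi(\lambda)$$

for all $\psi$ in $W_f$. This operator satisfies (10.1) for all
$\psi \in W_f$.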
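
A concrete instance may help fix the definition of $W_f$. Take, as an assumption of this sketch (it is not an example from the text), $H = \ell^2(\mathbb{N})$ with $\mu(E)$ the projection onto the coordinates indexed by $E \subset \mathbb{N}$, so that $\mu_\psi(\{n\}) = |\psi_n|^2$, and let $f(n) = n$. Then $\psi \in W_f$ exactly when $\sum_n n^2|\psi_n|^2 < \infty$. The code compares partial sums for $\psi_n = 1/n$, which lies in $H$ but not in $W_f$, and $\phi_n = 1/n^2$, which lies in $W_f$; the function names are hypothetical.

```python
import math


def norm_squared(psi, terms):
    """Partial sum of |psi_n|^2, i.e. mu_psi applied to {1, ..., terms}."""
    return sum(abs(psi(n)) ** 2 for n in range(1, terms + 1))


def domain_integral(f, psi, terms):
    """Partial sum of |f(n)|^2 |psi_n|^2, the integral in (10.2) truncated."""
    return sum(abs(f(n)) ** 2 * abs(psi(n)) ** 2 for n in range(1, terms + 1))


def f(n):
    return float(n)          # unbounded function on X = {1, 2, 3, ...}


def psi(n):
    return 1.0 / n           # square-summable, but n * psi_n is not


def phi(n):
    return 1.0 / n ** 2      # n * phi_n = 1/n is square-summable


psi_norm = norm_squared(psi, 1000)                   # stays below pi^2/6
psi_domain_100 = domain_integral(f, psi, 100)        # equals 100 up to rounding
psi_domain_1000 = domain_integral(f, psi, 1000)      # equals 1000 up to rounding
phi_domain_1000 = domain_integral(f, phi, 1000)      # stays below pi^2/6
```

The partial sums for $\psi$ grow without bound, so $\psi \notin W_f$, while those for $\phi$ remain bounded; in this model $\int_X f\,d\mu$ is the diagonal operator $\psi_n \mapsto n\psi_n$ with its natural domain.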


Note that since $\mu_\psi$ is a finite measure for all $\psi$, if $f$ is
bounded then the domain of $\int_X f\,d\mu$ is all of $H$. Thus, in the
bounded case, the definition of $\int_X f\,d\mu$ in Proposition 10.1 agrees
with our earlier definition (in Chap. 7) of the integral. This means, in
particular, that if $f$ is a bounded function, $\int_X f\,d\mu$ is a
bounded operator. Proposition 10.1 follows immediately from the following
result.


Proposition 10.2 Let $f$ be a measurable function on $X$ and let $W_f$ be
as in (10.2). Then the following results hold.

1. The space $W_f$ is a dense subspace of $H$ and the map
$Q_f : W_f \to \mathbb{C}$ given by

$$Q_f(\psi) = \int_X f(\lambda)\,d\mu_\psi(\lambda)$$

is a quadratic form on $W_f$.

2. If $L_f$ is the associated sesquilinear form on $W_f$, we have

$$|L_f(\phi, \psi)| \leq \|\phi\|\,\|f\|_{L^2(X, \mu_\psi)} \tag{10.3}$$

for all $\phi, \psi \in W_f$.

3. For each $\psi \in W_f$, there is a unique $\chi \in H$ such that
$L_f(\phi, \psi) = \langle \phi, \chi \rangle$ for all $\phi \in W_f$.
Furthermore, the map $\psi \mapsto \chi$ is linear and for all
$\psi \in W_f$, we have

$$\|\chi\|^2 = \int_X |f|^2\,d\mu_\psi. \tag{10.4}$$

Proof. It is easy to see that $W_f$ is closed under scalar multiplication.
To show that it is closed under addition, note that since $\mu(E)$ is
self-adjoint and satisfies $\mu(E)^2 = \mu(E)$, we have

$$\begin{aligned}
\mu_{\phi+\psi}(E) &= \|\mu(E)(\phi + \psi)\|^2 \\
&\leq \bigl(\|\mu(E)\phi\| + \|\mu(E)\psi\|\bigr)^2 \\
&\leq 2\|\mu(E)\phi\|^2 + 2\|\mu(E)\psi\|^2 \\
&= 2\mu_\phi(E) + 2\mu_\psi(E),
\end{aligned}$$

where in the third line we have used the elementary inequality
$(x + y)^2 \leq 2x^2 + 2y^2$.

To show that $W_f$ is dense in $H$, let
$E_n = \{x \in X \mid |f(x)| < n\}$. If
$\psi \in \mathrm{Range}(\mu(E_n))$, then $\mu_\psi(E_n^c) = 0$, and, thus,

$$\int_X |f|^2\,d\mu_\psi = \int_{E_n} |f|^2\,d\mu_\psi \leq n^2\mu_\psi(E_n) < \infty, \tag{10.5}$$

showing that $\psi$ belongs to $W_f$. Since also $\bigcup_n E_n = X$, the
union of the ranges of the $\mu(E_n)$'s is dense and contained in $W_f$.

If $f$ is bounded, $Q_f$ may be computed as

$$Q_f(\psi) = \left\langle \psi, \left(\int_X f\,d\mu\right)\psi \right\rangle, \qquad \psi \in H,$$

where $\int_X f\,d\mu$ is as in Chap. 7. Thus, $Q_f$ is a quadratic form
for which the associated sesquilinear form is

$$L_f(\phi, \psi) = \left\langle \phi, \left(\int_X f\,d\mu\right)\psi \right\rangle, \qquad \phi, \psi \in H.$$

This form satisfies

$$\begin{aligned}
|L_f(\phi, \psi)| &\leq \|\phi\|\,\left\|\left(\int_X f\,d\mu\right)\psi\right\| \\
&= \|\phi\|\,\|f\|_{L^2(X, \mu_\psi)}
\end{aligned} \tag{10.6}$$

for all $\phi, \psi \in H$, where in the second line we have used (10.1).

If $f$ is unbounded and $\psi$ belongs to $W_f$, let $f_n = f1_{E_n}$.
Then $Q_f(\psi) = \lim_{n \to \infty} Q_{f_n}(\psi)$, by monotone
convergence, in which case, it is easy to see that $Q_f$ is still a
quadratic form and that (10.6) still holds for all $\phi \in H$. From
(10.6), we see that for each $\psi \in W_f$, the conjugate-linear
functional $\phi \mapsto L_f(\phi, \psi)$ is bounded. Thus, by (the
complex-conjugate of) the Riesz theorem, there is a unique vector $\chi$
such that $L_f(\phi, \psi) = \langle \phi, \chi \rangle$. Furthermore,
(10.6) tells us that $\|\chi\| \leq \|f\|_{L^2(X, \mu_\psi)}$. Conversely,
since $L_f(\phi, \psi) = \langle \phi, \chi \rangle$, (10.6) is an equality
when $\phi = \chi$, showing that $\|\chi\| = \|f\|_{L^2(X, \mu_\psi)}$.
Finally, the map $\psi \mapsto \chi$ is linear because $L_f(\phi, \psi)$ is
linear in $\psi$. ∎


Proposition 10.3 Jf f is a real-valued, measurable function on X, then 
ie f du is self-adjoint on Wy. 


Proof. Let $A_f = \int_X f \, d\mu$. Define subsets $F_n$ of $X$ by

\[
F_n = \{ x \in X \mid n - 1 \le |f(x)| < n \},
\]

so that $X$ is the disjoint union of the $F_n$'s, and let $W^n = \mathrm{Range}(\mu(F_n))$. As
in the proof of Proposition 10.2, any $\psi \in W^n$ is in $W_f$, and the quadratic
form $Q_f$ is bounded on $W^n$ [compare (10.5)]. Furthermore, if $\phi \in (W^n)^\perp$
and $\psi \in W^n$, it is straightforward to check that $\mu_{\phi+\psi} = \mu_\phi + \mu_\psi$ and so

\[
Q_f(\phi + \psi) = Q_f(\phi) + Q_f(\psi). \tag{10.7}
\]

From (10.7), we obtain, by the polarization identity,

\[
\langle \phi, A_f \psi \rangle = L_f(\phi, \psi) = 0.
\]

This shows that $A_f \psi$ belongs to $(W^n)^{\perp\perp} = W^n$.

We conclude that $A_f$ maps $W^n$ boundedly to itself. Indeed, the restriction
to $W^n$ of $A_f$ coincides with the restriction to $W^n$ of the bounded
operator obtained by integrating $f 1_{F_n}$ with respect to $\mu$ (compare the
quadratic forms). Furthermore, since $Q_f$ is real-valued, the restriction of
$A_f$ to $W^n$ is self-adjoint (Proposition A.63).
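Now, $\mathbf{H}$ is the orthogonal direct sum of the $W^n$'s, meaning that $\mathbf{H}$ may be
identified with the set of infinite sequences $(\psi_1, \psi_2, \psi_3, \ldots)$ with $\psi_n \in W^n$
and such that

\[
\sum_{n=1}^{\infty} \|\psi_n\|^2 < \infty.
\]

If $A_n$ denotes the restriction of $A_f$ to $W^n$, then under this decomposition
of $\mathbf{H}$, we have

\[
\begin{aligned}
W_f &= \left\{ \psi = (\psi_1, \psi_2, \ldots) \in \mathbf{H} \,\middle|\, \sum_{n=1}^{\infty} \|A_n \psi_n\|^2 < \infty \right\} \\
&= \left\{ \psi = (\psi_1, \psi_2, \ldots) \,\middle|\, \sum_{n=1}^{\infty} \left( \|\psi_n\|^2 + \|A_n \psi_n\|^2 \right) < \infty \right\}.
\end{aligned} \tag{10.8}
\]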


To verify (10.8), we note that

\[
\int_X |f|^2 \, d\mu_\psi = \sum_{n=1}^{\infty} \int_{F_n} |f|^2 \, d\mu_\psi = \sum_{n=1}^{\infty} \|A_n \psi_n\|^2. \tag{10.9}
\]

The first equality is by monotone convergence and the second holds because
$\mu_\psi = \mu_{\psi_n}$ on $F_n$. In particular, the first quantity in (10.9) is finite if and
only if the last quantity is finite.

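By a similar argument, for $\psi \in W_f$, we have

\[
\langle \psi, A_f \psi \rangle = \int_X f(\lambda) \, d\mu_\psi(\lambda) = \sum_{n=1}^{\infty} \langle \psi_n, A_n \psi_n \rangle,
\]

from which it follows that

\[
\langle \phi, A_f \psi \rangle = \sum_{n=1}^{\infty} \langle \phi_n, A_n \psi_n \rangle
\]

for all $\phi, \psi \in W_f$. From this we see that $A_f \psi$ is the vector represented by
the sequence $(A_1 \psi_1, A_2 \psi_2, \ldots)$. It then follows from Example 9.26 that $A_f$
is self-adjoint. ∎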


Theorem 10.4 (Spectral Theorem, First Form) Suppose $A$ is a
self-adjoint operator on $\mathbf{H}$. Then there is a unique projection-valued measure
$\mu^A$ on $\sigma(A)$ with values in $\mathcal{B}(\mathbf{H})$ such that

\[
\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A. \tag{10.10}
\]

Since the spectrum of $A$ is typically an unbounded set, the function
$f(\lambda) = \lambda$ is an unbounded function on $\sigma(A)$. Note also that the equality
in (10.10) includes, as always, equality of domains. That is, the domain of
the integral on the left-hand side, namely the space $W_f$ in Proposition 10.1,
coincides with $\mathrm{Dom}(A)$. The proof of this theorem is given in Sect. 10.4.
Definition 10.5 (Functional Calculus) For any measurable function $f$
on $\sigma(A)$, define a (possibly unbounded) operator, denoted $f(A)$, by

\[
f(A) = \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda).
\]

As usual, we can extend the projection-valued measure $\mu^A$ from $\sigma(A)$ to
$\mathbb{R}$ by setting $\mu^A$ equal to zero on the complement of $\sigma(A)$.


Definition 10.6 (Spectral Subspaces) If $A$ is a self-adjoint operator
on $\mathbf{H}$, then for any Borel set $E \subset \mathbb{R}$, define the spectral subspace $V_E$
of $\mathbf{H}$ by

\[
V_E = \mathrm{Range}(\mu^A(E)).
\]

Definition 10.7 (Measurement Probabilities) If $A$ is a self-adjoint
operator on $\mathbf{H}$, then for any unit vector $\psi \in \mathbf{H}$, define a probability measure
$\mu^A_\psi$ on $\mathbb{R}$ by the formula

\[
\mu^A_\psi(E) = \langle \psi, \mu^A(E) \psi \rangle.
\]

If the operator $A$ represents some observable in quantum mechanics,
then we interpret $\mu^A_\psi$ to be the probability distribution for the result of
measuring $A$ in the state $\psi$.
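In finite dimensions the projection-valued measure is a finite sum of eigenprojections, so Definitions 10.5 to 10.7 can be checked by hand. The following Python sketch is only an illustration under stated assumptions: the matrix `A`, the helper names, and the choice to describe a Borel set by its indicator function are all hypothetical and do not come from the text.

```python
# Hypothetical 2x2 illustration: A = [[0, 1], [1, 0]] has eigenvalues +1 and -1,
# with orthogonal eigenprojections (I + A)/2 and (I - A)/2.
A = [[0.0, 1.0], [1.0, 0.0]]
EIGENPROJECTIONS = {
    1.0: [[0.5, 0.5], [0.5, 0.5]],
    -1.0: [[0.5, -0.5], [-0.5, 0.5]],
}


def weighted_sum(weight):
    """Return the sum over eigenvalues lam of weight(lam) * P_lam."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lam, proj in EIGENPROJECTIONS.items():
        w = weight(lam)
        for i in range(2):
            for j in range(2):
                total[i][j] += w * proj[i][j]
    return total


def mu(indicator):
    """Projection-valued measure of the set E whose indicator function is given."""
    return weighted_sum(lambda lam: 1.0 if indicator(lam) else 0.0)


def f_of_A(f):
    """Functional calculus: the integral of f against mu, here a finite sum."""
    return weighted_sum(f)


def probability(psi, indicator):
    """mu_psi(E) = <psi, mu(E) psi> for a unit vector psi."""
    proj = mu(indicator)
    image = [sum(proj[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum((psi[i].conjugate() * image[i]).real for i in range(2))


psi = [1.0, 0.0]                                # unit vector in C^2
print(f_of_A(lambda lam: lam))                  # integral of lambda d(mu) gives back A
print(probability(psi, lambda lam: lam > 0))    # probability of measuring +1 in state psi
```

For an unbounded operator the sum becomes a genuine integral and domain questions appear, which is exactly what the finite model cannot show.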


Proposition 10.8 Let $A$ be a self-adjoint operator on $\mathbf{H}$. Then the spectral
subspaces $V_E$ associated to $A$ have the following properties.

1. If $E$ is a bounded subset of $\mathbb{R}$, then $V_E \subset \mathrm{Dom}(A)$, $V_E$ is invariant
under $A$, and the restriction of $A$ to $V_E$ is bounded.

2. If $E$ is contained in $(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$, then for all $\psi \in V_E$, we have
$\|(A - \lambda_0 I)\psi\| \le \varepsilon \|\psi\|$.

Proof. Point 1 holds because the function $f(\lambda) = \lambda$ is bounded on $E$. (See
the proof of Proposition 10.3.) Point 2 then holds because, as in the proof
of Proposition 10.3, the restriction of $A$ to $V_E$ coincides with the restriction
to $V_E$ of the operator $f(A)$, where $f(\lambda) = \lambda 1_E(\lambda)$. ∎


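Theorem 10.9 (Spectral Theorem, Second Form) Suppose $A$ is a
self-adjoint operator on $\mathbf{H}$. Then there is a $\sigma$-finite measure $\mu$ on $\sigma(A)$,
a direct integral

\[
\int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda),
\]

and a unitary map $U$ from $\mathbf{H}$ to the direct integral such that

\[
U(\mathrm{Dom}(A)) = \left\{ s \in \int_{\sigma(A)}^{\oplus} \mathbf{H}_\lambda \, d\mu(\lambda) \,\middle|\, \int_{\sigma(A)} \|\lambda s(\lambda)\|_\lambda^2 \, d\mu(\lambda) < \infty \right\}
\]

and such that

\[
(U A U^{-1}(s))(\lambda) = \lambda s(\lambda)
\]

for all $s \in U(\mathrm{Dom}(A))$.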


Theorem 10.10 (Spectral Theorem, Multiplication Operator Form)
Suppose $A$ is a self-adjoint operator on $\mathbf{H}$. Then there is a $\sigma$-finite measure
space $(X, \mu)$, a measurable, real-valued function $h$ on $X$, and a unitary map
$U : \mathbf{H} \to L^2(X, \mu)$ such that

\[
U(\mathrm{Dom}(A)) = \{ \psi \in L^2(X, \mu) \mid h\psi \in L^2(X, \mu) \}
\]

and such that

\[
(U A U^{-1}(\psi))(x) = h(x)\psi(x)
\]

for all $\psi \in U(\mathrm{Dom}(A))$.


These theorems are also proved in Sect. 10.4. 


10.2 Stone’s Theorem and One-Parameter Unitary 
Groups 


In this section we explore the notion of one-parameter unitary groups and 
their connection to self-adjoint operators. We assume here the spectral 
theorem, the proof of which (in Sect. 10.4) does not use any results from 
this section. 


Definition 10.11 A one-parameter unitary group on $\mathbf{H}$ is a family
$U(t)$, $t \in \mathbb{R}$, of unitary operators with the property that $U(0) = I$ and that
$U(s+t) = U(s)U(t)$ for all $s, t \in \mathbb{R}$. A one-parameter unitary group is said
to be strongly continuous if

\[
\lim_{s \to t} \|U(t)\psi - U(s)\psi\| = 0 \tag{10.11}
\]

for all $\psi \in \mathbf{H}$ and all $t \in \mathbb{R}$.


Almost all one-parameter unitary groups arising in applications are 
strongly continuous. 


Example 10.12 Let $\mathbf{H} = L^2(\mathbb{R}^n)$ and let $U_a(t)$ be the translation operator
given by

\[
(U_a(t)\psi)(x) = \psi(x + ta). \tag{10.12}
\]

Then $U_a(\cdot)$ is a strongly continuous one-parameter unitary group.
Proof. It is easy to see that $U_a(\cdot)$ is a one-parameter unitary group. To see
that $U_a(\cdot)$ is strongly continuous, consider first the case in which $\psi$ is
continuous and compactly supported. Since a continuous function on a
compact metric space is automatically uniformly continuous, it follows that
$\psi(x + ta)$ tends uniformly to $\psi(x)$ as $t$ tends to zero. Since also the support
of $\psi$ is compact and thus of finite measure, it follows that $\psi(x + ta)$ tends
to $\psi(x)$ in $L^2(\mathbb{R}^n)$ as $t$ tends to zero.

Now, the space $C_c(\mathbb{R}^n)$ of continuous functions of compact support is
dense in $L^2(\mathbb{R}^n)$ (Theorem A.10). Thus, given $\varepsilon > 0$ and $\psi \in L^2(\mathbb{R}^n)$, we
can find $\phi \in C_c(\mathbb{R}^n)$ such that $\|\psi - \phi\|_{L^2(\mathbb{R}^n)} < \varepsilon/3$. Then choose $\delta$ so that
$\|U_a(\tau)\phi - \phi\| < \varepsilon/3$ whenever $|\tau| < \delta$. Then given $t \in \mathbb{R}$, if $|t - s| < \delta$, we
have

\[
\begin{aligned}
\|U_a(t)\psi - U_a(s)\psi\|
&\le \|U_a(t)\psi - U_a(t)\phi\| + \|U_a(t)\phi - U_a(s)\phi\| + \|U_a(s)\phi - U_a(s)\psi\| \\
&= \|U_a(t)(\psi - \phi)\| + \|U_a(s)(U_a(t - s)\phi - \phi)\| + \|U_a(s)(\phi - \psi)\|.
\end{aligned} \tag{10.13}
\]

Since $U_a(t)$ and $U_a(s)$ are unitary, we can see that each of the terms on the
last line of (10.13) is less than $\varepsilon/3$. ∎

Note that for $a \neq 0$ the unitary group $U_a(\cdot)$ in Example 10.12 is not
continuous in the operator norm topology. After all, given any $\varepsilon \neq 0$, we
can take a nonzero element $\psi$ of $L^2(\mathbb{R}^n)$ that is supported in a very small
ball around the origin. Then $U_a(\varepsilon)\psi$ is orthogonal to $\psi$ and has the same
norm as $\psi$, so that

\[
\|U_a(\varepsilon)\psi - U_a(0)\psi\| = \|U_a(\varepsilon)\psi - \psi\| = \sqrt{2} \, \|\psi\|.
\]

Thus, $\|U_a(\varepsilon) - U_a(0)\| \ge \sqrt{2}$ for all $\varepsilon \neq 0$.
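The contrast between strong continuity and norm continuity can be seen numerically in one dimension. The sketch below is a rough illustration, assuming $n = 1$, $a = 1$, and a midpoint-rule approximation of the $L^2$ norm on a finite window; the function names and the particular test functions are hypothetical choices made here.

```python
import math


def l2_norm(f, lo=-4.0, hi=4.0, n=80_000):
    """Midpoint-rule approximation of the L^2(R) norm of f, assuming f vanishes outside [lo, hi]."""
    h = (hi - lo) / n
    return math.sqrt(h * sum(abs(f(lo + (k + 0.5) * h)) ** 2 for k in range(n)))


def translate(f, t, a=1.0):
    """(U_a(t) f)(x) = f(x + t a) in one dimension."""
    return lambda x: f(x + t * a)


def shift_distance(f, t):
    """Approximate || U_a(t) f - f || for a = 1."""
    g = translate(f, t)
    return l2_norm(lambda x: g(x) - f(x))


def tent(x):
    """A fixed continuous, compactly supported function."""
    return max(0.0, 1.0 - abs(x))


def bump(width):
    """Indicator of an interval of the given width centred at the origin."""
    return lambda x: 1.0 if abs(x) < width / 2 else 0.0


for t in (0.5, 0.05):
    narrow = bump(t / 2)          # the support shrinks together with the shift t
    ratio = shift_distance(narrow, t) / l2_norm(narrow)
    # For the fixed tent function the distance shrinks with t,
    # while for the adapted bump the ratio stays near sqrt(2).
    print(t, shift_distance(tent, t), ratio)
```

For each fixed vector the distance goes to zero, but the vector that witnesses a large distance changes with the shift, which is why no operator-norm bound can hold.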


Definition 10.13 If $U(\cdot)$ is a strongly continuous one-parameter unitary
group, the infinitesimal generator of $U(\cdot)$ is the operator $A$ given by

\[
A\psi = \lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t}, \tag{10.14}
\]

with $\mathrm{Dom}(A)$ consisting of the set of $\psi \in \mathbf{H}$ for which the limit in (10.14)
exists in the norm topology on $\mathbf{H}$.


The following result shows that we can construct a strongly continuous
one-parameter unitary group from any self-adjoint operator $A$ by setting
$U(t) = e^{itA}$. Furthermore, the original operator $A$ is precisely the infinitesimal
generator of $U(\cdot)$.


Proposition 10.14 Suppose $A$ is a self-adjoint operator on $\mathbf{H}$ and let $U(\cdot)$
be defined by

\[
U(t) = e^{itA},
\]

where the operator $e^{itA}$ is defined by the functional calculus for $A$. Then
the following hold.

1. $U(\cdot)$ is a strongly continuous one-parameter unitary group.

2. For all $\psi \in \mathrm{Dom}(A)$, we have

\[
A\psi = \lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t},
\]

where the limit is in the norm topology on $\mathbf{H}$.

3. For all $\psi \in \mathbf{H}$, if the limit

\[
\lim_{t \to 0} \frac{1}{i} \frac{U(t)\psi - \psi}{t}
\]

exists in the norm topology on $\mathbf{H}$, then $\psi \in \mathrm{Dom}(A)$ and the limit is
equal to $A\psi$.


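Proof. Since $\sigma(A) \subset \mathbb{R}$, the function $f(\lambda) := e^{it\lambda}$ is bounded on $\sigma(A)$ and
satisfies $f(\lambda)\overline{f(\lambda)} = 1$ for all $\lambda \in \sigma(A)$. Thus, the operator $f(A)$ is bounded
and satisfies

\[
f(A) f(A)^* = f(A)^* f(A) = I,
\]

which shows that $f(A) = e^{itA}$ is unitary. The multiplicativity of the functional
calculus then tells us that $U(\cdot)$ is a one-parameter unitary group. To
see that $U(\cdot)$ is strongly continuous, note that

\[
\begin{aligned}
\|U(t)\psi - U(s)\psi\|^2 &= \langle \psi, (U(t)^* - U(s)^*)(U(t) - U(s))\psi \rangle \\
&= \int_{-\infty}^{\infty} \left| e^{it\lambda} - e^{is\lambda} \right|^2 d\mu^A_\psi(\lambda).
\end{aligned} \tag{10.15}
\]

The integral on the right-hand side of (10.15) tends to zero as $s$ approaches
$t$, by dominated convergence.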

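For Point 2, recall from Theorem 10.4 that $A = \int_{-\infty}^{\infty} \lambda \, d\mu^A(\lambda)$, and
take $\psi \in \mathrm{Dom}(A)$. Then, by (10.4), we have

\[
\left\| \frac{1}{i} \frac{U(t)\psi - \psi}{t} - A\psi \right\|^2
= \int_{-\infty}^{\infty} \left| \frac{1}{i} \frac{e^{it\lambda} - 1}{t} - \lambda \right|^2 d\mu^A_\psi(\lambda). \tag{10.16}
\]

If we write the function $e^{it\lambda} - 1$ as the integral of its derivative with respect
to $\lambda$, starting at $\lambda = 0$, we can see that $|(e^{it\lambda} - 1)/t| \le |\lambda|$. Meanwhile,
since $\psi$ is in the domain of the operator $A = \int_{-\infty}^{\infty} \lambda \, d\mu^A(\lambda)$, we have
$\int_{-\infty}^{\infty} \lambda^2 \, d\mu^A_\psi(\lambda) < \infty$. Thus, we may apply dominated convergence, with
$4\lambda^2$ as our dominating function, to show that the right-hand side of (10.16)
tends to zero as $t$ tends to zero.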
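For Point 3, let $B$ be the infinitesimal generator of $U(\cdot)$. If $\phi$ and $\psi$ belong
to $\mathrm{Dom}(B)$, then

\[
\begin{aligned}
\langle \phi, B\psi \rangle &= \lim_{t \to 0} \left\langle \phi, \frac{1}{i} \frac{U(t)\psi - \psi}{t} \right\rangle \\
&= \lim_{t \to 0} \left\langle \frac{1}{i} \frac{U(-t)\phi - \phi}{-t}, \psi \right\rangle \\
&= \langle B\phi, \psi \rangle.
\end{aligned}
\]

Thus, $B$ is symmetric. On the other hand, Point 2 shows that $B$ is an
extension of $A$, so by Exercise 7 in Chap. 9, $B = A$ (with equality of
domain).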
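The three points of Proposition 10.14 can be tested in a finite model where $A$ is already in multiplication-operator form. The sketch below is an illustration only, assuming $\mathbf{H} = \mathbb{C}^4$ and a hypothetical real function `h`, so that $e^{itA}$ multiplies the $k$-th component by $e^{it h_k}$; every name in it is introduced here for the example.

```python
import cmath
import math

# Hypothetical finite model: H = C^4 and A is multiplication by the real values h_k.
h = [-2.0, 0.0, 0.5, 3.0]


def U(t, psi):
    """Apply e^{itA}: multiply the k-th component by e^{i t h_k}."""
    return [cmath.exp(1j * t * hk) * pk for hk, pk in zip(h, psi)]


def A(psi):
    """Apply the multiplication operator A."""
    return [hk * pk for hk, pk in zip(h, psi)]


def norm(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def generator_error(psi, t):
    """Norm of (1/i)(U(t)psi - psi)/t - A psi, the quantity estimated in (10.16)."""
    quotient = [(u - p) / (1j * t) for u, p in zip(U(t, psi), psi)]
    return norm([q - a for q, a in zip(quotient, A(psi))])


psi = [1.0, 1j, -2.0, 0.5]
print(norm(U(0.7, psi)), norm(psi))      # unitarity: the two norms agree
for t in (1e-1, 1e-2, 1e-3):
    print(t, generator_error(psi, t))    # the difference quotient approaches A psi
```

In this model every vector lies in the domain of $A$; the domain condition in Point 3 only becomes a real restriction when `h` is unbounded.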


Theorem 10.15 (Stone’s Theorem) Suppose $U(\cdot)$ is a strongly continuous
one-parameter unitary group on $\mathbf{H}$. Then the infinitesimal generator
$A$ of $U(\cdot)$ is densely defined and self-adjoint, and $U(t) = e^{itA}$ for all $t \in \mathbb{R}$.


If U(-) is a strongly continuous one-parameter unitary group, then U(-) 
is continuous in the operator norm topology if and only if the infinitesimal 
generator of U(-) is a bounded operator (Exercise 1). As Example 10.12 
suggests, most one-parameter unitary groups that arise in applications are 
not continuous in the operator norm topology. 

Before giving the proof of Stone’s theorem, let us work out the generator 
of the group in Example 10.12. 


Example 10.16 If $U_a(\cdot)$, $a \in \mathbb{R}^n$, is the strongly continuous one-parameter
unitary group in Example 10.12, then each $\psi \in C_c^\infty(\mathbb{R}^n)$ is in
the domain of the infinitesimal generator $A$ of $U_a(\cdot)$ and for all such $\psi$, we
have

\[
A\psi = -i \sum_{j} a_j \frac{\partial \psi}{\partial x_j}. \tag{10.17}
\]

Furthermore, $A$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^n)$.

Proof. The formula for the infinitesimal generator is easy to establish for
$\psi$ in $C_c^\infty(\mathbb{R}^n)$. The essential self-adjointness of $A$ is a special case of
Proposition 13.5 (the proof of which is similar to the proof of Proposition 9.29). ∎

We now establish two intermediate results before coming to the proof of 
Stone’s theorem. 


Lemma 10.17 Let $U(\cdot)$ be a strongly continuous one-parameter unitary
group and let $A$ be its infinitesimal generator. If $\psi \in \mathrm{Dom}(A)$, then for all
$t \in \mathbb{R}$, the vector $U(t)\psi$ belongs to $\mathrm{Dom}(A)$ and

\[
\lim_{h \to 0} \frac{U(t + h)\psi - U(t)\psi}{h} = iU(t)A\psi = iAU(t)\psi. \tag{10.18}
\]
Note that Lemma 10.17 tells us that the curve $\psi(t) := U(t)\psi_0$ in $\mathbf{H}$
satisfies the differential equation

\[
\frac{d\psi}{dt} = iA\psi(t)
\]

in the natural Hilbert space sense, provided that $\psi_0$ belongs to $\mathrm{Dom}(A)$.
This result, together with Proposition 10.14, tells us that if $\psi_0 \in \mathrm{Dom}(\hat{H})$,
then the curve $\psi(t) := e^{-it\hat{H}/\hbar}\psi_0$ indeed solves the Schrödinger equation
in the Hilbert space sense.


Proof. We compute that

\[
\frac{U(t + h)\psi - U(t)\psi}{h} = U(t) \, \frac{U(h)\psi - \psi}{h}. \tag{10.19}
\]

Since $\psi \in \mathrm{Dom}(A)$, the limit as $h$ tends to zero of (10.19) exists and is
equal to $iU(t)A\psi$. On the other hand,

\[
\frac{U(t + h)\psi - U(t)\psi}{h} = \frac{U(h)U(t)\psi - U(t)\psi}{h}.
\]

Thus, the limit as $h$ tends to zero of (10.19) is, by the definition of $A$, equal
to $iA(U(t)\psi)$. This shows that $U(t)\psi$ is in the domain of $A$ and establishes
the second equality in (10.18). ∎


Lemma 10.18 For any strongly continuous one-parameter unitary group 
U(-), the infinitesimal generator A is densely defined. 


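Proof. Given any continuous function $f$ of compact support, define an
operator $B_f$ by setting

\[
B_f = \int_{-\infty}^{\infty} f(\tau) U(\tau) \, d\tau.
\]

Here, the operator-valued integral is the unique bounded operator such
that

\[
\langle \phi, B_f \psi \rangle = \int_{-\infty}^{\infty} f(\tau) \langle \phi, U(\tau)\psi \rangle \, d\tau. \tag{10.20}
\]

[It is easy to see that the right-hand side of (10.20) defines a bounded sesquilinear
form, for each fixed $f \in C_c^\infty(\mathbb{R})$.]
Using the group property of $U(\cdot)$, we see that

\[
\begin{aligned}
U(t)B_f\psi - B_f\psi &= \int_{-\infty}^{\infty} \left[ f(\tau)U(\tau + t)\psi - f(\tau)U(\tau)\psi \right] d\tau \\
&= \int_{-\infty}^{\infty} \left[ f(\tau - t) - f(\tau) \right] U(\tau)\psi \, d\tau,
\end{aligned}
\]

where in the second line, we have made a change of variable in the first
term in the integral. From this, we easily obtain that

\[
\lim_{t \to 0} \frac{U(t)B_f\psi - B_f\psi}{t} = -\int_{-\infty}^{\infty} f'(\tau) U(\tau)\psi \, d\tau.
\]

This shows that $B_f\psi$ is in the domain of $A$ for all $\psi \in \mathbf{H}$ and $f \in C_c^\infty(\mathbb{R})$.

Now choose a sequence $f_n \in C_c^\infty(\mathbb{R})$ such that $f_n$ is non-negative and
supported in the interval $[-1/n, 1/n]$ and such that $\int_{-\infty}^{\infty} f_n(\tau) \, d\tau = 1$.
Then for any $\psi \in \mathbf{H}$, we have

\[
B_{f_n}\psi - \psi = \int_{-\infty}^{\infty} f_n(\tau) \left[ U(\tau)\psi - \psi \right] d\tau
\]

so that

\[
\begin{aligned}
\|B_{f_n}\psi - \psi\| &\le \int_{-1/n}^{1/n} f_n(\tau) \, \|U(\tau)\psi - \psi\| \, d\tau \\
&\le \sup_{-1/n \le \tau \le 1/n} \|U(\tau)\psi - \psi\|.
\end{aligned}
\]

Since $U(\cdot)$ is strongly continuous, we see that $B_{f_n}\psi$ converges to $\psi$ as
$n \to \infty$. Thus, every element of $\mathbf{H}$ can be approximated by vectors in the
domain of $A$. ∎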

Proof of Theorem 10.15. Suppose $U(\cdot)$ is a strongly continuous one-parameter
unitary group and $A$ is its infinitesimal generator. By Lemma
10.18, $A$ is densely defined. As shown in the proof of Proposition 10.14, $A$
(denoted by $B$ in that proof) is symmetric.

Next, we show that $A$ is essentially self-adjoint. Suppose now that $\psi$
belongs to the kernel of $A^* - iI$, i.e., $A^*\psi = i\psi$. Given $\phi \in \mathrm{Dom}(A)$,
set $y(t) = \langle U(t)\phi, \psi \rangle$, so that $|y(t)| \le \|\phi\| \, \|\psi\|$. On the other hand, we
expect that $U(t) = e^{itA}$, so that $U(t)^*$ should be $e^{-itA^*}$. Thus, $y(t)$ should
(formally) be equal to $\langle \phi, e^{t}\psi \rangle$. If this is correct, then since $y(t)$ is a bounded
function of $t$, we must have $\langle \phi, \psi \rangle = 0$. Thus, $\psi$ would be orthogonal to
every element of a dense subspace of $\mathbf{H}$, showing that $\psi = 0$. We could
then similarly argue that $\ker(A^* + iI) = \{0\}$, which would show that $A$ is
essentially self-adjoint.

To make the argument rigorous, we apply Lemma 10.17, giving

\[
\begin{aligned}
\frac{d}{dt} \langle U(t)\phi, \psi \rangle &= \langle iAU(t)\phi, \psi \rangle = \langle iU(t)\phi, A^*\psi \rangle \\
&= \langle iU(t)\phi, i\psi \rangle = \langle U(t)\phi, \psi \rangle.
\end{aligned}
\]

Thus, the function $y(t) := \langle U(t)\phi, \psi \rangle$ satisfies the ordinary differential
equation $dy/dt = y$. The unique solution to this equation is $y(t) = y(0)e^{t}$.
Since $y$ is bounded, we must have $0 = y(0) = \langle \phi, \psi \rangle$ for all $\phi \in \mathrm{Dom}(A)$,
which implies that $\psi = 0$. Thus, $\ker(A^* - iI) = \{0\}$, and by a similar
argument $\ker(A^* + iI) = \{0\}$. This shows (Corollary 9.22) that $A$ is essentially
self-adjoint.

We can now construct a strongly continuous unitary group $V(\cdot)$ by setting
$V(t) = e^{itA^{cl}}$. To show that $V(\cdot) = U(\cdot)$, take $\psi \in \mathrm{Dom}(A) \subset \mathrm{Dom}(A^{cl})$
and set $w(t) = U(t)\psi - V(t)\psi$. By Proposition 10.14, the
infinitesimal generator of $V(\cdot)$ is $A^{cl}$. Thus, applying Lemma 10.17 to both
$U(\cdot)$ and $V(\cdot)$, we have

\[
\begin{aligned}
\frac{d}{dt} w(t) &= iAU(t)\psi - iA^{cl}V(t)\psi \\
&= iA^{cl}w(t),
\end{aligned}
\]

where the limit defining $dw/dt$ is taken in the norm topology on $\mathbf{H}$. Thus,

\[
\begin{aligned}
\frac{d}{dt} \|w(t)\|^2 &= \langle iA^{cl}w(t), w(t) \rangle + \langle w(t), iA^{cl}w(t) \rangle \\
&= -i \langle A^{cl}w(t), w(t) \rangle + i \langle w(t), A^{cl}w(t) \rangle \\
&= 0,
\end{aligned}
\]

because $A^{cl}$ is symmetric. Since also $w(0) = 0$, we conclude that $w(t) = 0$
for all $t$. Thus, $U(\cdot)$ and $V(\cdot)$ agree on a dense subspace and hence on all
of $\mathbf{H}$.

We now know that $U(t) = e^{itA^{cl}}$. It then follows from Points 2 and
3 of Proposition 10.14 that the infinitesimal generator of $U(\cdot)$ (namely
$A$) is precisely $A^{cl}$. That is, $A = A^{cl}$ and $U(t) = e^{itA}$. Furthermore, we
have already shown that $A$ is essentially self-adjoint and we now know
that $A = A^{cl}$, so $A$ is actually self-adjoint. Finally, if $B$ is any self-adjoint
operator for which $U(t) = e^{itB}$, then by Proposition 10.14, $B$ must be the
infinitesimal generator of $U(\cdot)$, i.e., $B = A$. ∎


10.3 The Spectral Theorem for Bounded Normal 
Operators 


We are going to prove the spectral theorem for an unbounded self-adjoint 
operator by reducing it to the spectral theorem for a bounded operator. 
The reduction, however, will not be to a bounded self-adjoint operator, but 
rather to a unitary operator. Although we proved the spectral theorem only 
for bounded self-adjoint operators, the theorem applies more generally to 
bounded normal operators. (See Exercise 4 in Chap. 7 for the matrix case.) 


Definition 10.19 A bounded operator A on H is normal if A commutes 
with its adjoint: AA* = A*A. 


Every bounded self-adjoint operator is obviously normal. Other examples 
of normal operators are skew-self-adjoint operators (A* = —A) and unitary 
operators (UU* = U*U = I). The spectrum of a bounded normal operator
need not be contained in R, but can be an arbitrary closed, bounded, 
nonempty subset of C. On the other hand, if U is unitary, then the spectrum 
of U is contained in the unit circle (Exercise 6 in Chap. 7). 

In this section, we consider the spectral theorem for a bounded normal 
operator A. The statements of the two versions of the theorem are precisely 
the same as in the self-adjoint case, except that o(A) is no longer necessarily 
contained in the real line. Almost all of the proofs of these results are the 
same as in the self-adjoint case; we will, therefore, consider only those steps 
where some modification in the argument is required. 


Theorem 10.20 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal. Then there exists a unique
projection-valued measure $\mu^A$ on the Borel $\sigma$-algebra in $\sigma(A)$, with values
in $\mathcal{B}(\mathbf{H})$, such that

\[
\int_{\sigma(A)} \lambda \, d\mu^A(\lambda) = A.
\]

Furthermore, for any measurable set $E \subset \sigma(A)$, $\mathrm{Range}(\mu^A(E))$ is invariant
under $A$ and $A^*$.


Once we have the projection-valued measure $\mu^A$, we can define a functional
calculus for $A$, as in the self-adjoint case, by setting

\[
f(A) = \int_{\sigma(A)} f(\lambda) \, d\mu^A(\lambda)
\]

for any bounded measurable function $f$ on $\sigma(A)$.
We can also define spectral subspaces, as in the self-adjoint case, by setting

\[
V_E := \mathrm{Range}(\mu^A(E))
\]

for each Borel set $E \subset \sigma(A)$. These spectral subspaces have precisely the
same properties (with the same proofs) as in Proposition 7.15, with the
following two exceptions. First, the assertion that $V_E$ is invariant under $A$
should be replaced by the assertion that $V_E$ is invariant under $A$ and $A^*$.
Second, in Point 2 of the proposition, the condition $E \subset (\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$
should be replaced by $E \subset D(\lambda_0, \varepsilon)$, where $D(z, r)$ denotes the disk of
radius $r$ in $\mathbb{C}$ centered at $z$.

Meanwhile, the spectral theorem in its direct integral and multiplica- 
tion operator versions also holds for a bounded normal operator A. The 
statements are identical to the self-adjoint case, except that we no longer 
assume o(A) C R and we no longer assume that the function h in the 
multiplication operator version is real valued. 

Let us recall the two stages in the proof of the spectral theorem (first 
version) for bounded self-adjoint operators. The first stage is the construc- 
tion of the continuous functional calculus. The steps in this construction are 
(1) the equality of the norm and spectral radius for self-adjoint operators, 
(2) the spectral mapping theorem, and (3) the Stone—Weierstrass theorem. 
The second stage is a sort of operator-valued Riesz representation theo- 
rem, which we prove by reducing it to the ordinary Riesz representation 
theorem using quadratic forms. In generalizing from bounded self-adjoint 
to bounded normal operators, the second stage of the proof is precisely the 
same as in the self-adjoint case. In the first stage, however, there are some 
additional ideas needed in each step of the argument. 

There is a relatively simple argument that reduces the equality of norm 
and spectral radius for normal operators to the self-adjoint case. Mean- 
while, since the spectral mapping theorem, as stated in Chap. 8, already 
holds for arbitrary bounded operators, it appears that no change is needed 
in this step. We must think, however, about the proper notion of “polynomial.”
For a general normal operator $A$, the spectrum of $A$ is not contained
in $\mathbb{R}$, and, thus, powers of $\lambda$ are complex-valued functions on $\sigma(A)$. We
must, therefore, use the complex-valued version of the Stone-Weierstrass
theorem (Appendix A.3.1), which requires that our algebra of functions be
closed under complex-conjugation. This means that we need to consider
polynomials in $\lambda$ and $\bar{\lambda}$, that is, linear combinations of functions of the
form $\lambda^m \bar{\lambda}^n$.

What we need, then, is a form of the spectral mapping theorem that
applies to this sort of polynomial. On the operator side, the natural counterpart
to the complex conjugate of a function is the adjoint of an operator.
Thus, applying the function $\lambda^m \bar{\lambda}^n$ to a normal operator $A$ should give
$A^m (A^*)^n$. The desired “spectral mapping theorem” is then the following:
If $p$ is a polynomial in two variables, and $A$ is a bounded normal operator,
then

\[
\sigma(p(A, A^*)) = \{ p(\lambda, \bar{\lambda}) \mid \lambda \in \sigma(A) \}. \tag{10.21}
\]


This statement is true (Theorem 10.23), but its proof is not nearly as 
simple as the proof of the ordinary spectral mapping theorem. One way 
to prove (10.21) is to use the theory of commutative C*-algebras, as in 
[33]. (See Theorem 11.19 in [33] along with the assertion on p. 321 that 
the spectrum of an element is independent of the algebra containing that 
element.) Another approach is the direct argument found in Bernau [3], 
which uses no fancy machinery but which is long and not easily motivated. 
A third approach is to use the spectral theorem for bounded self-adjoint 
operators to help us prove (10.21); this is the approach we will follow. 

We begin with the equality of norm and spectral radius and then turn 
to (10.21). 


Proposition 10.21 If $A \in \mathcal{B}(\mathbf{H})$ is normal, then
$\|A\| = R(A)$.

Lemma 10.22 If $A$ and $B$ are commuting elements of $\mathcal{B}(\mathbf{H})$, then
$R(AB) \le R(A)R(B)$.
Proof. If $A$ is any bounded operator, the proof of Lemma 8.1 shows that
for any real number $T$ with $T > R(A)$, we have

\[
\lim_{m \to \infty} \frac{\|A^m\|}{T^m} = 0.
\]

If $A$ and $B$ are two commuting bounded operators and $S$ and $T$ are two
real numbers, with $S > R(A)$ and $T > R(B)$, then

\[
\frac{\|(AB)^m\|}{S^m T^m} = \frac{\|A^m B^m\|}{S^m T^m} \le \frac{\|A^m\|}{S^m} \, \frac{\|B^m\|}{T^m}.
\]

Thus,

\[
\lim_{m \to \infty} \frac{\|(AB)^m\|}{S^m T^m} = 0. \tag{10.22}
\]

Meanwhile, if we apply the expression for the resolvent in the proof of
Lemma 8.1 to $AB$, we obtain

\[
(AB - \lambda)^{-1} = -\sum_{m=0}^{\infty} \frac{A^m B^m}{\lambda^{m+1}}, \tag{10.23}
\]

since $A$ and $B$ commute. For any $\lambda_1$ with $|\lambda_1| > R(A)R(B)$, take $\lambda_2$ with
$|\lambda_1| > |\lambda_2| > R(A)R(B)$. The terms in (10.23) with $\lambda = \lambda_2$ tend to zero
by (10.22), which means that (10.23) converges with $\lambda = \lambda_1$. Thus, $\lambda_1$ is
in the resolvent set of $AB$. ∎

Proof of Proposition 10.21. For any bounded operator, $\|A\| \ge R(A)$
(Proposition 7.5). To get the inequality in the other direction, recall
(Proposition 7.2) that $\|A\|^2 = \|A^*A\|$. Note also that $A^*A$ is self-adjoint, since its
adjoint is $A^*A^{**} = A^*A$. Thus, if $A$ and $A^*$ commute, we have

\[
\begin{aligned}
\|A\|^2 = \|A^*A\| = R(A^*A) &\le R(A^*)R(A) \\
&\le \|A^*\| R(A) = \|A\| R(A).
\end{aligned}
\]

Here we have used Lemmas 8.1 and 10.22 and the general inequality between
norm and spectral radius. Dividing by $\|A\|$ gives $\|A\| \le R(A)$, unless
$\|A\| = 0$, in which case the desired inequality is trivially satisfied. ∎
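For $2 \times 2$ matrices both sides of Proposition 10.21 can be computed exactly, which makes the role of normality visible. The Python sketch below is illustrative only: the helper names and the two test matrices are assumptions made here, and the operator norm is computed through the identity $\|M\|^2 = R(M^*M)$ used in the proof.

```python
import cmath
import math


def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def adjoint(M):
    return [[complex(M[j][i]).conjugate() for j in range(2)] for i in range(2)]


def eigenvalues(M):
    """Roots of the characteristic polynomial of a 2x2 matrix."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + disc) / 2, (tr - disc) / 2]


def spectral_radius(M):
    return max(abs(lam) for lam in eigenvalues(M))


def operator_norm(M):
    """||M|| via ||M||^2 = ||M* M|| = R(M* M), valid because M* M is self-adjoint."""
    return math.sqrt(spectral_radius(matmul(adjoint(M), M)))


def is_normal(M, tol=1e-12):
    left, right = matmul(M, adjoint(M)), matmul(adjoint(M), M)
    return all(abs(left[i][j] - right[i][j]) < tol for i in range(2) for j in range(2))


rotation = [[0, -1], [1, 0]]     # normal but not self-adjoint
nilpotent = [[0, 1], [0, 0]]     # not normal

for name, M in (("rotation", rotation), ("nilpotent", nilpotent)):
    print(name, is_normal(M), operator_norm(M), spectral_radius(M))
```

The nilpotent matrix shows that the equality can fail badly without normality: its spectral radius vanishes while its norm does not.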


Theorem 10.23 If $A \in \mathcal{B}(\mathbf{H})$ is normal, then for any polynomial $p$ in two
variables, we have

\[
\sigma(p(A, A^*)) = \{ p(\lambda, \bar{\lambda}) \mid \lambda \in \sigma(A) \}.
\]


If, for example, $p(\lambda, \bar{\lambda}) = \lambda^2 \bar{\lambda}^3$, then $p(A, A^*) = A^2 (A^*)^3$. Note that since
$A$ and $A^*$ are assumed to commute, the map sending the polynomial $p(\lambda, \bar{\lambda})$
to $p(A, A^*)$ is an algebra homomorphism. That is to say, $(pq)(A, A^*) =
p(A, A^*) q(A, A^*)$. This would not be the case if $A$ did not commute with $A^*$.
We begin by proving Theorem 10.23 in the case that A is a normal 
matrix. Although the matrix case is quite simple, it provides an outline for 
our assault on the general result. 

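Proof of Theorem 10.23 in the Matrix Case. For matrices, the spectrum
is nothing but the set of eigenvalues. If $A$ commutes with $A^*$, then
for any $\lambda \in \mathbb{C}$,

\[
\begin{aligned}
\langle (A^* - \bar{\lambda} I)\psi, (A^* - \bar{\lambda} I)\psi \rangle &= \langle \psi, (A - \lambda I)(A^* - \bar{\lambda} I)\psi \rangle \\
&= \langle \psi, (A^* - \bar{\lambda} I)(A - \lambda I)\psi \rangle \\
&= \langle (A - \lambda I)\psi, (A - \lambda I)\psi \rangle.
\end{aligned} \tag{10.24}
\]

Thus, if $\psi$ is an eigenvector for $A$ with eigenvalue $\lambda$, $\psi$ is automatically
an eigenvector for $A^*$ with eigenvalue $\bar{\lambda}$. It then easily follows that $\psi$ is an
eigenvector for $p(A, A^*)$ with eigenvalue $p(\lambda, \bar{\lambda})$.

In the other direction, suppose $\mu$ is an eigenvalue for $p(A, A^*)$ and let $W$
denote the $\mu$-eigenspace for $p(A, A^*)$. Since $A$ and $A^*$ commute with each
other, they also commute with $p(A, A^*)$. Thus, $A$ and $A^*$ preserve $W$, as
is easily verified, and the operator $A|_W$ will have some eigenvector $\psi$ with
eigenvalue $\lambda$. Since $A\psi = \lambda\psi$, then, as in (10.24), $A^*\psi = \bar{\lambda}\psi$ and so

\[
p(A, A^*)\psi = p(\lambda, \bar{\lambda})\psi.
\]

Since also $p(A, A^*)\psi = \mu\psi$, by assumption, we have $\mu = p(\lambda, \bar{\lambda})$, where $\lambda$
is an eigenvalue for $A$.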
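The matrix case can be checked directly on a small example. The following Python sketch is a hedged illustration: the normal matrix `A` (built to have eigenvalues $1$ and $i$), the test polynomial $p(\lambda, \bar{\lambda}) = \lambda^2 \bar{\lambda} + 2\bar{\lambda}$, and all helper names are hypothetical choices made here, not taken from the text.

```python
import cmath


def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def adjoint(M):
    return [[complex(M[j][i]).conjugate() for j in range(2)] for i in range(2)]


def eigenvalues(M):
    """Roots of the characteristic polynomial of a 2x2 matrix."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + disc) / 2, (tr - disc) / 2]


def p_scalar(lam):
    """p(lam, conj(lam)) = lam^2 conj(lam) + 2 conj(lam)."""
    return lam * lam * lam.conjugate() + 2 * lam.conjugate()


def p_matrix(M):
    """p(M, M*) = M^2 M* + 2 M*, the operator counterpart of p_scalar."""
    Ms = adjoint(M)
    cubic = matmul(matmul(M, M), Ms)
    return [[cubic[i][j] + 2 * Ms[i][j] for j in range(2)] for i in range(2)]


def same_set(xs, ys, tol=1e-9):
    """True if every point of each list is within tol of some point of the other."""
    return (all(min(abs(x - y) for y in ys) < tol for x in xs)
            and all(min(abs(x - y) for x in xs) < tol for y in ys))


# Normal matrix with eigenvectors (1, 1) and (1, -1) and eigenvalues 1 and i.
a, b = (1 + 1j) / 2, (1 - 1j) / 2
A = [[a, b], [b, a]]

spectrum_of_p = eigenvalues(p_matrix(A))
p_of_spectrum = [p_scalar(lam) for lam in eigenvalues(A)]
print(spectrum_of_p, p_of_spectrum, same_set(spectrum_of_p, p_of_spectrum))
```

Replacing `A` by a non-normal matrix generally breaks the agreement, since $p(A, A^*)$ then depends on the order of the factors.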

We now attempt to run the same argument for a bounded normal operator
on $\mathbf{H}$, replacing “eigenvector” with “almost eigenvector,” where $\psi$
is an $\varepsilon$-almost eigenvector for $A$ if $\|(A - \lambda I)\psi\|$ is less than $\varepsilon \|\psi\|$. The
main difficulty with this approach is that for a given eigenvalue $\lambda$, the set
of $\varepsilon$-almost eigenvectors is not a vector space. To surmount this difficulty,
we will use the spectral theorem for the self-adjoint operator $B^*B$, where
$B = p(A, A^*) - \mu I$, with $\mu \in \sigma(p(A, A^*))$. We will construct a spectral
subspace $W$ for $B^*B$ such that $W$ is invariant under $A$ and $A^*$ and such
that each element of $W$ is an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with eigenvalue
$\mu$. (Note, however, that we are not claiming that $W$ contains all the
$\varepsilon$-almost eigenvectors for $p(A, A^*)$.)


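Definition 10.24 If $A \in \mathcal{B}(\mathbf{H})$, then an $\varepsilon$-almost eigenvector for $A$
with eigenvalue $\lambda \in \mathbb{C}$ is a nonzero vector $\psi \in \mathbf{H}$ such that

\[
\|(A - \lambda I)\psi\| < \varepsilon \|\psi\|.
\]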


We now establish three lemmas about almost eigenvectors, the last of 
which makes use of the spectral theorem for bounded self-adjoint operators. 
With these lemmas in hand, we will have a clear path to imitate the proof 
of the matrix case of Theorem 10.23. 
Lemma 10.25 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal.

1. If $\psi$ is an $\varepsilon$-almost eigenvector for $A$ with eigenvalue $\lambda$, then $\psi$ is an
$\varepsilon$-almost eigenvector for $A^*$ with eigenvalue $\bar{\lambda}$.

2. A number $\lambda \in \mathbb{C}$ belongs to $\sigma(A)$ if and only if for all $\varepsilon > 0$, there
exists an $\varepsilon$-almost eigenvector with eigenvalue $\lambda$.

Proof. Point 1 follows immediately from (10.24), which holds for bounded
normal operators, not just matrices. For Point 2, suppose that an $\varepsilon$-almost
eigenvector with eigenvalue $\lambda$ exists for all $\varepsilon > 0$. Then $A - \lambda I$ cannot have
a bounded inverse, and so $\lambda \in \sigma(A)$. In the other direction, if there is some
$\varepsilon > 0$ for which no $\varepsilon$-almost eigenvector exists, then

\[
\|(A - \lambda I)\psi\| \ge \varepsilon \|\psi\| \tag{10.25}
\]

for all $\psi \in \mathbf{H}$, showing that $A - \lambda I$ is injective. By (10.24), the same
inequality holds with $A - \lambda I$ replaced by $A^* - \bar{\lambda} I$. Thus, $A^* - \bar{\lambda} I$ is injective,
so by Proposition 7.3, the range of $A - \lambda I$ is dense in $\mathbf{H}$. Using (10.25) as
in the proof of Proposition 7.7, it is easily seen that the range of $A - \lambda I$ is
also closed, hence all of $\mathbf{H}$. Thus, $A - \lambda I$ is invertible and the inverse is
bounded, by (10.25). ∎


Lemma 10.26 Suppose $A \in \mathcal{B}(\mathbf{H})$ is normal. Then for each polynomial
$p$ in two variables and each number $\lambda \in \mathbb{C}$, there is a constant $C$ such
that if $\psi$ is an $\varepsilon$-almost eigenvector for $A$ with eigenvalue $\lambda$, then $\psi$ is a
$(C\varepsilon)$-almost eigenvector for $p(A, A^*)$ with eigenvalue $p(\lambda, \bar{\lambda})$.

Proof. We decompose $p(A, A^*) - p(\lambda, \bar{\lambda})I$ into a linear combination of
terms of the form $A^k (A^*)^l - \lambda^k \bar{\lambda}^l I$ and we estimate such terms by induction
on $k + l$. If $k = 1$ and $l = 0$, there is nothing to prove, and if $k = 0$ and
$l = 1$, we use (10.24). Assume now that we have established the desired
result for $k + l = N$ and consider a case with $k + l = N + 1$. If $k > 0$, we
write

\[
\begin{aligned}
(A^k (A^*)^l - \lambda^k \bar{\lambda}^l I)\psi &= A^{k-1} (A^*)^l (A - \lambda I)\psi \\
&\quad + \lambda (A^{k-1} (A^*)^l - \lambda^{k-1} \bar{\lambda}^l I)\psi.
\end{aligned} \tag{10.26}
\]

Since $\psi$ is an $\varepsilon$-almost eigenvector and $A$ and $A^*$ are bounded, the norm of
the first term on the right-hand side of (10.26) is at most $c_1 \varepsilon$. By induction,
the norm of the second term on the right-hand side of (10.26) is at most
$|\lambda| c_2 \varepsilon$. Thus, the norm of the left-hand side of (10.26) is at most
$(c_1 + |\lambda| c_2)\varepsilon$. A similar analysis holds if $k = 0$, in which case $l > 0$. ∎


Lemma 10.27 Let $A \in \mathcal{B}(\mathbf{H})$ be normal, let $p$ be a polynomial in two
variables, and let $\mu$ be an element of the spectrum of $p(A, A^*)$. Then for
all $\varepsilon > 0$, there exists a nonzero closed subspace $W^\varepsilon$ of $\mathbf{H}$ such that $W^\varepsilon$ is
invariant under $A$ and $A^*$ and such that every nonzero element of $W^\varepsilon$ is
an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with eigenvalue $\mu$.

Proof. Fix some $\mu$ in the spectrum of $p(A, A^*)$ and let $B = p(A, A^*) - \mu I$.
Then $B$ is normal and $0$ belongs to the spectrum of $B$. Using Point 2 of
Lemma 10.25 and Lemma 10.26, we see that $0$ belongs to the spectrum of
the self-adjoint operator $B^*B$. We apply the spectral theorem to $B^*B$ and
we let $W^\varepsilon$ be the spectral subspace for $B^*B$ corresponding to the interval
$(-\varepsilon^2, \varepsilon^2)$. By Proposition 7.15, $W^\varepsilon$ is nonzero and invariant under $B^*B$,
and the restriction of $B^*B$ to $W^\varepsilon$ has norm at most $\varepsilon^2$. Thus, for all $\psi \in W^\varepsilon$
we have

\[
\langle B\psi, B\psi \rangle = \langle \psi, B^*B\psi \rangle \le \|\psi\| \, \|B^*B\psi\| \le \varepsilon^2 \|\psi\|^2.
\]

Since $B = p(A, A^*) - \mu I$, this shows that every nonzero element of $W^\varepsilon$
is an $\varepsilon$-almost eigenvector for $p(A, A^*)$ with eigenvalue $\mu$. Furthermore, $A$
and $A^*$ commute with $B^*B$ and thus they preserve each spectral subspace
of $B^*B$ (Proposition 7.16) including $W^\varepsilon$. ∎

Proof of Theorem 10.23. Suppose first that $\lambda$ belongs to the spectrum of
$A$. By Point 2 of Lemma 10.25, $A$ has $\varepsilon$-almost eigenvectors with eigenvalue
$\lambda$ for every $\varepsilon > 0$. Lemma 10.26 then shows that $p(A, A^*)$ has $(C\varepsilon)$-almost
eigenvectors with eigenvalue $p(\lambda, \bar{\lambda})$ for every $\varepsilon > 0$, which shows that
$p(\lambda, \bar{\lambda})$ is in the spectrum of $p(A, A^*)$.

In the other direction, suppose that $\mu$ is in the spectrum of $p(A, A^*)$.
For any $\varepsilon > 0$, we consider the nonzero subspace $W^\varepsilon$ in Lemma 10.27,
which is invariant under $A$ and $A^*$. The restriction of $A$ to $W^\varepsilon$ is again a
normal operator (Exercise 8), and $A|_{W^\varepsilon}$ has nonempty spectrum
(Proposition 7.5). If we fix some $\lambda \in \sigma(A|_{W^\varepsilon})$, Lemma 10.25 tells us that there
exists an $\varepsilon$-almost eigenvector $\psi$ for $A$ in $W^\varepsilon$. By Lemma 10.26, $\psi$ is a
$(C\varepsilon)$-almost eigenvector for $p(A, A^*)$ with eigenvalue $p(\lambda, \bar{\lambda})$. Meanwhile, since
$\psi \in W^\varepsilon$, the same vector $\psi$ is also an $\varepsilon$-almost eigenvector for $p(A, A^*)$
with eigenvalue $\mu$. It then is easy to see (Exercise 10) that

\[
|\mu - p(\lambda, \bar{\lambda})| < C\varepsilon + \varepsilon. \tag{10.27}
\]

Since (10.27) holds for all $\varepsilon > 0$, we can find a sequence $\lambda_n$ of points in
$\sigma(A)$ such that $p(\lambda_n, \bar{\lambda}_n) \to \mu$. Since $\sigma(A)$ is compact, we can pass to a
subsequence of the $\lambda_n$'s that is convergent to some $\lambda \in \sigma(A)$, and this $\lambda$
will satisfy $p(\lambda, \bar{\lambda}) = \mu$. ∎

Combining Theorem 10.23 with the equality of the norm and spectral
radius for normal operators (Proposition 10.21), we have the following result.
If $A \in \mathcal{B}(\mathbf{H})$ is normal and $p$ is a polynomial in two variables, then

\[
\|p(A, A^*)\| = \sup_{\lambda \in \sigma(A)} |p(\lambda, \bar{\lambda})|.
\]

The map $p \mapsto p(A, A^*)$ has the property that $\bar{p}(A, A^*) = (p(A, A^*))^*$,
where the polynomial $\bar{p}$ is the complex-conjugate of $p$. In particular, if $p$
takes only real values on $\sigma(A)$, then $p(A, A^*)$ is self-adjoint.
By the complex-valued version of the Stone-Weierstrass theorem (A.12),
polynomials in $\lambda$ and $\bar{\lambda}$ are dense in $C(\sigma(A); \mathbb{C})$, the space of continuous
complex-valued functions on $\sigma(A)$. Thus, the BLT theorem (Theorem A.36)
tells us that we can extend the map $p \mapsto p(A, A^*)$ to an isometric map of
$C(\sigma(A); \mathbb{C})$ into $\mathcal{B}(\mathbf{H})$. This extension, which we call the continuous functional
calculus for $A$, has all the same properties as in the self-adjoint case.

Now that the continuous functional calculus for normal operators has 
been established, the proof of the spectral theorem—in any of its various 
versions—proceeds exactly as in the self-adjoint case. There is no need, 
then, to repeat the arguments given in Chap. 8. 


10.4 Proof of the Spectral Theorem for Unbounded 
Self-Adjoint Operators 


To prove the spectral theorem for an unbounded self-adjoint operator A, 
we will construct from A a certain unitary (and thus normal) operator 
U. We then apply the spectral theorem for bounded normal operators to 
U and translate this result into the desired result for A. To motivate the 
construction of U, consider the function 

\[ C(x) = \frac{x + i}{x - i}. \tag{10.28} \]

It is a simple matter to check that $C$ maps $\mathbb{R}$ injectively onto $S^1 \setminus \{1\}$, with
inverse given by

\[ D(u) := i\,\frac{u + 1}{u - 1}, \quad u \in S^1 \setminus \{1\}. \tag{10.29} \]

Furthermore, we have $\lim_{x \to \pm\infty} C(x) = 1$. The function $C(x)$ in (10.28) is
the simplest bounded, injective function one can define on $\mathbb{R}$.

We wish to apply the map $C$ to a self-adjoint operator $A$. If $A$ is bounded
and self-adjoint, it is straightforward to check that the operator
$(A + iI)(A - iI)^{-1}$ is unitary (Exercise 5). Even in the unbounded case, it is possible to
make sense of the operator $U := C(A)$, and we can recover $A$ from $U$, by
(essentially) applying $D$. The operator $U$ is unitary and is known as the
Cayley transform of $A$.

Recall that if $A$ is self-adjoint, then $i$ is in the resolvent set of $A$ and the
operator $(A - iI)^{-1}$ maps $\mathbf{H}$ into $\mathrm{Dom}(A)$.








Theorem 10.28 (Cayley Transform) If $A$ is a self-adjoint operator on
$\mathbf{H}$, let $U$ be the operator defined by

\[ U\psi = (A + iI)(A - iI)^{-1}\psi. \]

Then the following results hold.

1. The operator $U$ is a unitary operator on $\mathbf{H}$.
2. The operator $U - I$ is injective.
3. The range of the operator $U - I$ is equal to $\mathrm{Dom}(A)$ and for all
$\psi \in \mathrm{Range}(U - I)$ we have

\[ A\psi = i(U + I)(U - I)^{-1}\psi. \tag{10.30} \]


According to Point 2, $U - I$ is injective, while according to Point 3, the
range of $U - I$ is $\mathrm{Dom}(A)$. Thus, in (10.30), the expression $(U - I)^{-1}$ refers
to the inverse of the one-to-one and onto map $U - I : \mathbf{H} \to \mathrm{Dom}(A)$. We
are not claiming that 1 is in the resolvent set of $U$. That is to say, $(U - I)^{-1}$
is not a bounded operator, unless $\mathrm{Dom}(A) = \mathbf{H}$, which occurs only if $A$ is
bounded.

Proof. The resolvent operator $(A - iI)^{-1}$ must be injective, because

\[ (A - iI)(A - iI)^{-1}\psi = \psi \]

for all $\psi \in \mathbf{H}$. Furthermore, $(A - iI)^{-1}$ maps $\mathbf{H}$ onto $\mathrm{Dom}(A)$, because

\[ \psi = (A - iI)^{-1}(A - iI)\psi \]

for all $\psi \in \mathrm{Dom}(A)$. Since $-i$ is also in the resolvent set of $A$, similar
reasoning shows that $A + iI$ maps $\mathrm{Dom}(A)$ injectively onto $\mathbf{H}$. Thus, $U$ is
the composition of one operator that maps $\mathbf{H}$ injectively onto $\mathrm{Dom}(A)$ and
another operator that maps $\mathrm{Dom}(A)$ injectively onto $\mathbf{H}$, so that $U$ maps
$\mathbf{H}$ injectively onto $\mathbf{H}$.

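Now, for any $\phi \in \mathrm{Dom}(A)$ we have

\[ \langle (A + iI)\phi, (A + iI)\phi \rangle = \langle A\phi, A\phi \rangle + \langle \phi, \phi \rangle \]

because of a familiar cancellation of cross terms, and the same identity
holds with $A - iI$ in place of $A + iI$. Thus, applying this with
$\phi = (A - iI)^{-1}\psi$ shows that for any $\psi \in \mathbf{H}$, we have

\[
\begin{aligned}
\langle (A + iI)(A - iI)^{-1}\psi, (A + iI)(A - iI)^{-1}\psi \rangle
&= \langle (A - iI)(A - iI)^{-1}\psi, (A - iI)(A - iI)^{-1}\psi \rangle \\
&= \langle \psi, \psi \rangle.
\end{aligned}
\]

Thus, $U$ is one-to-one and onto and preserves norms and is therefore
unitary.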
For Point 2, observe that for any $\psi \in \mathbf{H}$, we have

\[
\begin{aligned}
(A + iI)(A - iI)^{-1}\psi &= ((A - iI) + 2iI)(A - iI)^{-1}\psi \\
&= \psi + 2i(A - iI)^{-1}\psi.
\end{aligned}
\tag{10.31}
\]

Thus, since $(A - iI)^{-1}$ is injective, we cannot have $U\psi = \psi$ unless $\psi = 0$.
Finally, for Point 3, (10.31) says that

\[ U - I = 2i(A - iI)^{-1}, \tag{10.32} \]

which means (by the reasoning at the start of the proof) that the range of
$U - I$ is $\mathrm{Dom}(A)$. For $\psi \in \mathrm{Dom}(A)$, we then have

\[
\begin{aligned}
(U + I)(U - I)^{-1}\psi &= \frac{1}{2i}(U + I)(A - iI)\psi \\
&= \frac{1}{2i}\left[(A + iI)\psi + (A - iI)\psi\right] \\
&= \frac{1}{i}A\psi,
\end{aligned}
\]

which establishes Point 3. ∎
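
In the bounded (matrix) case, all three points of Theorem 10.28 can be checked directly. The sketch below assumes a randomly generated $4 \times 4$ Hermitian matrix as a stand-in for $A$; it says nothing about the unbounded case, where $(U - I)^{-1}$ is not a bounded operator.

```python
import numpy as np

rng = np.random.default_rng(1)

# A random Hermitian matrix plays the role of a bounded self-adjoint A.
n = 4
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (M + M.conj().T) / 2
I = np.eye(n)

# Cayley transform U = (A + iI)(A - iI)^{-1}.
resolvent = np.linalg.inv(A - 1j * I)
U = (A + 1j * I) @ resolvent

# Point 1: U is unitary.
is_unitary = np.allclose(U.conj().T @ U, I)
# Equation (10.32): U - I = 2i (A - iI)^{-1}, which is invertible here.
matches_10_32 = np.allclose(U - I, 2j * resolvent)
# Point 3, equation (10.30): A = i (U + I)(U - I)^{-1}.
A_recovered = 1j * (U + I) @ np.linalg.inv(U - I)
recovers_A = np.allclose(A_recovered, A)
```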

We may apply the spectral theorem for bounded normal operators to
associate a projection-valued measure $\mu^U$ to $U$. We will then transfer this
measure from $S^1 \setminus \{1\}$ to $\mathbb{R}$ by means of the map $D$ in (10.29) to obtain the
desired projection-valued measure $\mu^A$ for $A$.


Proposition 10.29 Let $A$ be a self-adjoint operator on $\mathbf{H}$, let $U$ be the
unitary operator in Theorem 10.28, and let $D : S^1 \setminus \{1\} \to \mathbb{R}$ be as in (10.29).
Then

\[ A = D(U), \tag{10.33} \]

where $D(U)$ is defined by the functional calculus for $U$.


More precisely, $D(U) = \int_{\sigma(U)} D(\lambda)\, d\mu^U(\lambda)$, where $\mu^U$ is the
projection-valued measure associated to $U$ by the spectral theorem for bounded normal
operators. Note that by Point 2 of Theorem 10.28, 1 is not an eigenvalue for
$U$ and thus $\mu^U(\{1\}) = 0$. Thus, $D$ is an almost-everywhere-defined function
on $\sigma(U)$, even if $1 \in \sigma(U)$. As always, the equality in (10.33) includes
equality of domains, where the domain of $\int D\, d\mu^U$ is the space $W_D$ in
Proposition 10.1.

Proposition 10.29 should certainly be plausible in light of the previously 
established formula (10.30) for A in terms of U. 
Proof. Suppose $E$ is a Borel subset of $S^1 \setminus \{1\}$ such that the closure of $E$
does not contain 1, and let $V_E = \mathrm{Range}(\mu^U(E))$ be the associated spectral
subspace. Then the spectrum of $U|_{V_E}$ is contained in $\overline{E}$, which means that
the functions $u \mapsto D(u)$ and $u \mapsto 1/(u - 1)$ are bounded on $\sigma(U|_{V_E})$. Now,
by comparing the quadratic forms, we can see that $D(U)|_{V_E} = D(U|_{V_E})$.
Then by the multiplicativity of the functional calculus for $U$ on bounded
functions, we have

\[ D(U)\psi = i(U + I)(U - I)^{-1}\psi \]

for all $\psi \in V_E$. Thus, by Point 3 of Theorem 10.28, $D(U)$ agrees with $A$
on $V_E$.

Meanwhile, if we decompose $S^1 \setminus \{1\}$ as the disjoint union of sets $E_n$,
for which $\overline{E_n}$ does not contain 1, then $\mathbf{H}$ is the Hilbert space direct sum
of the subspaces $V_{E_n}$. Now, $A$ and (by Proposition 10.3) $D(U)$ are both
self-adjoint. Furthermore, these operators agree on the finite direct sum
of the $V_{E_n}$'s and they are essentially self-adjoint on this finite sum, by
Example 9.26. Thus, $A$ and $D(U)$ must be equal (with equality of domain). ∎

Theorem 10.30 Define a projection-valued measure $\mu^A$ on $\mathbb{R}$ by

\[ \mu^A(E) = \mu^U(C(E)). \tag{10.34} \]

Then

\[ A = \int_{\mathbb{R}} \lambda \, d\mu^A(\lambda), \tag{10.35} \]

where $\mu^U$ is the projection-valued measure coming from the spectral theorem
for the bounded normal operator $U$ and $C$ is the map defined in (10.28).


Proof. If for any $\psi \in \mathbf{H}$, we define $\mu^A_\psi(E) = \langle \psi, \mu^A(E)\psi \rangle$ and similarly define
$\mu^U_\psi$, then we have

\[ \mu^A_\psi(E) = \mu^U_\psi(C(E)). \]

By the abstract change of variables theorem from measure theory, we have

\[ \int_{\mathbb{R}} \lambda^2 \, d\mu^A_\psi(\lambda) = \int_{S^1 \setminus \{1\}} D(u)^2 \, d\mu^U_\psi(u), \tag{10.36} \]

since $D$ is the inverse map to $C$. Thus, the two operators in (10.35) have
the same domain. Furthermore, if we replace $\lambda^2$ by $\lambda$ and $D(u)^2$ by $D(u)$
in (10.36), we see that the operators in (10.35) are also equal. ∎


Proof of Theorem 10.4. The existence of the desired projection-valued
measure $\mu^A$ is the content of Theorem 10.30. To establish uniqueness,
suppose $\nu^A$ is a projection-valued measure on $\sigma(A)$ such that $\int \lambda \, d\nu^A(\lambda) = A$.
Consider then the operator $C(A)$ as defined by integration of the function
$C(\lambda)$ against $\nu^A$. Arguing as in the proof of Proposition 10.29, we can see
that $C(A)$, computed in this fashion, coincides with the operator $U = C(A)$
defined as the product of $(A + iI)$ and $(A - iI)^{-1}$.

Now define a projection-valued measure $\nu^U$ on $S^1$ by setting
$\nu^U(E) = \nu^A(C^{-1}(E))$. Then as in the proof of Theorem 10.30, we have
$\int_{S^1} u \, d\nu^U(u) = U$. The uniqueness part of the spectral theorem for $U$ (Theorem 10.20)
then tells us that $\nu^U = \mu^U$, from which it follows that $\nu^A = \mu^A$. ∎


Proof of Theorem 10.9. By the direct-integral form of the spectral
theorem for $U = C(A)$, there is a family of Hilbert spaces $\mathbf{H}_\lambda$, $\lambda \in \sigma(U) \subset S^1$,
and a positive, real-valued measure $\mu$ on $\sigma(U)$ such that $\mathbf{H}$ is unitarily
equivalent to $\int^{\oplus}_{\sigma(U)} \mathbf{H}_\lambda \, d\mu(\lambda)$, in such a way that the operator $U$ corresponds to
the map $s(\lambda) \mapsto \lambda s(\lambda)$. Since 1 is not an eigenvalue for $U$, either $\mathbf{H}_1 = \{0\}$
or $\mu(\{1\}) = 0$. Either way, $\mathbf{H}_1$ is "negligible" in the direct integral. We can
then define a family of Hilbert spaces $\mathbf{K}_\lambda := \mathbf{H}_{C(\lambda)}$, for $\lambda \in \sigma(A) \subset \mathbb{R}$, and
a measure $\nu$ on $\sigma(A)$ given by $\nu(E) = \mu(C(E))$. We may then form the
direct integral $\int^{\oplus}_{\sigma(A)} \mathbf{K}_\lambda \, d\nu(\lambda)$. This direct integral is unitarily equivalent in
an obvious way to $\int^{\oplus}_{\sigma(U)} \mathbf{H}_\lambda \, d\mu(\lambda)$. We wish to show, then, that $\int^{\oplus}_{\sigma(A)} \mathbf{K}_\lambda \, d\nu(\lambda)$
is unitarily equivalent to $\mathbf{H}$ in such a way that the operator $A$ corresponds
to the (unbounded) operator mapping $s(\lambda)$ to $\lambda s(\lambda)$. Since the argument
is similar to that in the proof of Theorem 10.4, we omit the details.

As in the proof of Theorem 10.4, the uniqueness in Theorem 10.9 can
be reduced to the uniqueness for the direct-integral form of the spectral
theorem for $U$. ∎

The proof of the multiplication operator form of the spectral theorem 
for unbounded operators is similar to the preceding proofs and is omitted. 


10.5 Exercises 


1. (a) If $A$ is a bounded self-adjoint operator, show that $U(t) := e^{itA}$
is continuous in the operator norm topology.

(b) Using the spectral theorem, show that if $A$ is a self-adjoint
operator and $\sigma(A)$ is a bounded subset of $\mathbb{R}$, then $A$ is bounded.

(c) Suppose $A$ is a self-adjoint operator that is not bounded. Show
that $U(t) := e^{itA}$ is not continuous in the operator norm
topology.

Hint: Consider $\psi$ in a spectral subspace of the form $V_{(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)}$,
where $\lambda_0$ is a point in $\sigma(A)$ with $|\lambda_0|$ large.


2. Let $P_j$ be the unbounded self-adjoint operator defined in Sect. 9.8.
Show that the one-parameter unitary group $e^{itP_j}$ generated by $P_j$ is
given by

\[ (e^{itP_j}\psi)(x) = \psi(x + t\hbar e_j) \]

for all $\psi \in L^2(\mathbb{R}^n)$, where $e_j$ is the $j$th element of the standard basis
for $\mathbb{R}^n$.

Hint: First determine the Fourier transform of $e^{itP_j}\psi$, using
Proposition 9.32.


3. If $A$ is an unbounded self-adjoint operator on $\mathbf{H}$, let us say that a
family $\psi(t)$ of elements of $\mathbf{H}$ satisfies the equation

\[ \frac{d\psi}{dt} = iA\psi(t) \tag{10.37} \]

in the strong sense if each $\psi(t)$ belongs to $\mathrm{Dom}(A)$ and

\[ \lim_{h \to 0} \left\| \frac{\psi(t + h) - \psi(t)}{h} - iA\psi(t) \right\| = 0 \]

for every $t \in \mathbb{R}$. If we define $\psi(t)$ by $\psi(t) = e^{itA}\psi_0$, for some $\psi_0 \in \mathbf{H}$,
show that $\psi(t)$ satisfies (10.37) in the strong sense if and only if $\psi_0$
belongs to $\mathrm{Dom}(A)$.


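4. Suppose $A$ is an unbounded self-adjoint operator and suppose that
there exists a number $\gamma \in \mathbb{R}$ and a nonzero vector $\psi \in \mathrm{Dom}(A)$ such
that

\[ \|A\psi - \gamma\psi\| < \varepsilon\|\psi\| \]

for some $\varepsilon > 0$. Show that there exists a number $\tilde{\gamma}$ in the spectrum
of $A$ such that $|\gamma - \tilde{\gamma}| < \varepsilon$.

Hint: If no such $\tilde{\gamma}$ existed, the function $f(\lambda) := 1/(\lambda - \gamma)$ would
satisfy $|f(\lambda)| \le 1/\varepsilon$ for all $\lambda \in \sigma(A)$. Consider, then, the operator
$f(A)$, which is nothing but $(A - \gamma I)^{-1}$.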


5. If $A$ is a bounded self-adjoint operator, show that the operator $C(A)$
given by

\[ C(A) = (A + iI)(A - iI)^{-1} \]

is unitary and that 1 is in the resolvent set of $C(A)$. Show also that
$A$ can be recovered from $C(A)$ by the formula

\[ A = i(C(A) + I)(C(A) - I)^{-1}. \]


6. Show that Lemma 10.22 is false if we do not assume that $A$ and $B$
commute.


7. Let $A$ be a normal matrix and $p$ a polynomial in two variables. Show
by example that an eigenvector for $p(A, A^*)$ is not necessarily an
eigenvector for $A$.

Note: Nevertheless, the proof of the matrix case of Theorem 10.23
shows that if $\mu$ is an eigenvalue for $p(A, A^*)$, then there exists some
eigenvector for $p(A, A^*)$ with eigenvalue $\mu$ that is also an eigenvector
for $A$.


8. Suppose $A \in \mathcal{B}(\mathbf{H})$ and $W$ is a closed subspace of $\mathbf{H}$ that is invariant
under $A$ and $A^*$.

(a) Show that $(A|_W)^* = A^*|_W$.
(b) Show that if $A$ is normal, the restriction of $A$ to $W$ is normal.


9. (a) Suppose that $\mathbf{H}$ is finite dimensional, $A$ is a normal operator on
$\mathbf{H}$, and $W$ is a subspace of $\mathbf{H}$ that is invariant under $A$. Show
that $W$ is invariant under $A^*$.

(b) Show by example that the result of Part (a) is false if $\mathbf{H}$ is infinite
dimensional.


10. Given $A \in \mathcal{B}(\mathbf{H})$, suppose that the same vector $\psi$ is an $\varepsilon$-almost
eigenvector for $A$ with eigenvalue $\lambda$ and a $\delta$-almost eigenvector for $A$
with eigenvalue $\mu$. Show that $|\lambda - \mu| < \varepsilon + \delta$.


11


The Harmonic Oscillator 


11.1 The Role of the Harmonic Oscillator 


The harmonic oscillator is an important model for various reasons. In 
solid-state physics, for example, a crystal is modeled as a large number 
of coupled harmonic oscillators. Using the notion of “normal modes,” this 
model is then transformed into independent one-dimensional harmonic 
oscillators with different frequencies. In the quantum mechanical setting, 
the excitations of the different normal modes are called phonons. 

A free quantum field theory is similarly modeled as a family of cou- 
pled harmonic oscillators, except that in the field theory setting we have 
infinitely many of the oscillators. Even interacting quantum field theo- 
ries are often described using the harmonic oscillator raising and lowering 
operators, which are referred to as creation and annihilation operators in 
the context of field theory. 

Our approach to analyzing the harmonic oscillator also introduces the 
algebraic approach to quantum mechanics, in which algebra (commuta- 
tion relations between various operators) substantially replaces analysis 
(differential equations) as the way to solve quantum systems. Most of the 
effort in analyzing the harmonic oscillator occurs in the algebraic sec- 
tion (Sect. 11.2), with the remaining analytic issues being taken care of 
in Sects. 11.3 and 11.4. 


11.2 The Algebraic Approach


In this section we will derive as much information as possible about the 
Hamiltonian operator for a quantum harmonic oscillator using only the 
commutation relation between the position and momentum operators, 


\[ [X, P] = i\hbar I. \tag{11.1} \]

Here, as usual, $[\cdot, \cdot]$ denotes the commutator, given by $[A, B] = AB - BA$.
We consider, then, a harmonic oscillator with Hamiltonian given by

\[ \hat{H} = \frac{P^2}{2m} + \frac{k}{2}X^2, \tag{11.2} \]

where $k$ is a positive constant. Our goal is to see what we can say about
the eigenvectors and eigenvalues of $\hat{H}$ using only the fact that $X$ and $P$ are
self-adjoint operators satisfying (11.1), without making use of the actual
formulas for these operators.

To be honest, we are actually assuming certain domain conditions
regarding the operators $X$ and $P$, in addition to the commutation relation (11.1),
namely that the vectors $\psi_n$ in Theorem 11.2 are actually in the domain of
$X$ and $P$ (and thus, also, in the domain of the raising and lowering
operators). In this section, we follow the usual physics practice of assuming that
all the vectors we work with are in the domain of all the relevant
operators. This assumption will turn out to be correct in the case we are actually
considering, in which $X$ and $P$ are the usual position and momentum
operators on $L^2(\mathbb{R})$. (See Sect. 11.4.) It is a more complicated matter to work
out the domain conditions that must be imposed on two self-adjoint
operators satisfying (11.1) in order for the argument of the present section to
be valid. We will come back to this issue in Chap. 14.

Following, again, the convention in the physics literature, we now
eliminate the spring constant $k$ in favor of the frequency $\omega = \sqrt{k/m}$ of the
corresponding classical harmonic oscillator. [Solutions to Hamilton's equations
with classical Hamiltonian $H(x, p)$ equal to $p^2/(2m) + kx^2/2$ are sinusoidal
with frequency $\sqrt{k/m}$.] Replacing $k$ by $m\omega^2$, we may rewrite (11.2) as

\[ \hat{H} = \frac{1}{2m}\left(P^2 + (m\omega X)^2\right). \tag{11.3} \]


We now introduce the lowering operator $a$, given by

\[ a = \frac{m\omega X + iP}{\sqrt{2\hbar m\omega}} \tag{11.4} \]

and its adjoint $a^*$, the raising operator, given by

\[ a^* = \frac{m\omega X - iP}{\sqrt{2\hbar m\omega}}. \tag{11.5} \]

The reason for the terminology "raising" and "lowering" is that these
operators raise and lower the eigenvalue for the Hamiltonian, as we will
see shortly. In the context of quantum field theory, operators very much
like $a^*$ and $a$ are called creation operators and annihilation operators,
respectively, because they map from the $n$-particle space to either the $(n+1)$-particle
space or the $(n-1)$-particle space, thus "creating" or "annihilating"
a particle.

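In the world of noncommuting operators, $(A - B)(A + B)$ does not equal
$A^2 - B^2$; rather,

\[ (A - B)(A + B) = A^2 - B^2 + [A, B]. \]

Thus, if we compute $a^*a$ using (11.1) we get

\[
\begin{aligned}
a^*a &= \frac{1}{2\hbar m\omega}\left((m\omega X)^2 + P^2 + im\omega[X, P]\right) \\
&= \frac{1}{\hbar\omega}\,\frac{1}{2m}\left(P^2 + (m\omega X)^2\right) - \frac{1}{2}I.
\end{aligned}
\]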

From this we obtain

\[ \hat{H} = \hbar\omega\left(a^*a + \frac{1}{2}I\right). \]

The $\frac{1}{2}I$ on the right-hand side of this expression should be thought of as a
"quantum correction," in that there would be no such term in the analogous
formula for the classical Hamiltonian.

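It suffices to work out the spectral properties (eigenvectors and
eigenvalues) of $a^*a$. To get back to $\hat{H}$, we keep the same eigenvectors and
simply add 1/2 to the eigenvalues and then multiply by $\hbar\omega$. We compute
that

\[
\begin{aligned}
[a, a^*] &= \frac{1}{2\hbar m\omega}\left([m\omega X, -iP] + [iP, m\omega X]\right) \\
&= \frac{1}{2\hbar m\omega}\left(\hbar m\omega I + \hbar m\omega I\right) \\
&= I.
\end{aligned}
\tag{11.6}
\]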

From this, it is easy to compute that

\[ [a, a^*a] = a \tag{11.7} \]
\[ [a^*, a^*a] = -a^*. \tag{11.8} \]
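
For the reader who wants the intermediate step, one way to obtain (11.7) and (11.8) from (11.6) is the product rule for commutators, $[A, BC] = [A, B]C + B[A, C]$. This identity is not stated in the text and is used here only as a convenient shortcut:

```latex
\begin{align*}
[a, a^*a]   &= [a, a^*]\,a + a^*[a, a]     = I\,a + 0        = a, \\
[a^*, a^*a] &= [a^*, a^*]\,a + a^*[a^*, a] = 0 + a^*(-I)     = -a^*.
\end{align*}
```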


Now, $a^*a$ is self-adjoint (or, at the least, symmetric) because
$(a^*a)^* = a^*a^{**} = a^*a$. This operator is also non-negative, because

\[ \langle \psi, a^*a\psi \rangle = \langle a\psi, a\psi \rangle \ge 0 \]

for all $\psi$. We now come to a key computation, which demonstrates the
utility of the operators $a$ and $a^*$.

Proposition 11.1 Suppose that $\psi$ is an eigenvector for $a^*a$ with
eigenvalue $\lambda$. Then

\[ a^*a(a\psi) = (\lambda - 1)a\psi, \qquad a^*a(a^*\psi) = (\lambda + 1)a^*\psi. \]

Thus, either $a\psi$ is zero or $a\psi$ is an eigenvector for $a^*a$ with eigenvalue
$\lambda - 1$. Similarly, either $a^*\psi$ is zero or $a^*\psi$ is an eigenvector for $a^*a$ with
eigenvalue $\lambda + 1$. That is to say, the operators $a^*$ and $a$ raise and lower the
eigenvalues of $a^*a$, respectively.
Proof. Using the commutation relation (11.7), we find that

\[ a^*a(a\psi) = (a(a^*a) - a)\psi = (\lambda - 1)a\psi. \]

A similar calculation applies to $a^*\psi$, using (11.8). ∎
If $\psi$ is an eigenvector for $a^*a$ with eigenvalue $\lambda$, then

\[ \lambda \langle \psi, \psi \rangle = \langle \psi, a^*a\psi \rangle = \langle a\psi, a\psi \rangle \ge 0, \]

which means that $\lambda \ge 0$. Let us assume that $a^*a$ has at least one
eigenvector $\psi$, with eigenvalue $\lambda$, which we expect since $a^*a$ is self-adjoint. Since
$a$ lowers the eigenvalue of $a^*a$, if we apply $a$ repeatedly to $\psi$, we must
eventually get zero. After all, if $a^n\psi$ were always nonzero, these vectors
would be, for large $n$, eigenvectors for $a^*a$ with negative eigenvalue, which
we have seen is impossible.

It follows that there exists some $N \ge 0$ such that $a^N\psi \ne 0$ but $a^{N+1}\psi = 0$.
If we define $\psi_0$ by

\[ \psi_0 = a^N\psi, \]

then $a\psi_0 = 0$, which means that $a^*a\psi_0 = 0$. Thus, $\psi_0$ is an eigenvector for
$a^*a$ with eigenvalue 0. (It follows that the original eigenvalue $\lambda$ must have
been equal to the non-negative integer $N$.)

The conclusion is this: Provided that $a^*a$ has at least one eigenvector $\psi$,
we can find a nonzero vector $\psi_0$ such that

\[ a\psi_0 = a^*a\psi_0 = 0. \]

Since $a^*a$ cannot have negative eigenvalues, we may call $\psi_0$ a "ground state"
for $a^*a$, that is, an eigenvector with lowest possible eigenvalue. We may then
apply the raising operator $a^*$ repeatedly to $\psi_0$ to obtain eigenvectors for
$a^*a$ with positive eigenvalues.


Theorem 11.2 If $\psi_0$ is a unit vector with the property that $a\psi_0 = 0$, then
the vectors

\[ \psi_n = (a^*)^n\psi_0, \quad n \ge 0, \]

satisfy the following relations for all $n, m \ge 0$:

\[
\begin{aligned}
a^*\psi_n &= \psi_{n+1} \\
a^*a\psi_n &= n\psi_n \\
\langle \psi_n, \psi_m \rangle &= n!\,\delta_{nm} \\
a\psi_{n+1} &= (n+1)\psi_n.
\end{aligned}
\]


Let us think for a moment about what this is saying. We have an
orthogonal "chain" of eigenvectors for $a^*a$ with eigenvalues 0, 1, 2, ..., with the
norm of $\psi_n$ equal to $\sqrt{n!}$. The raising operator $a^*$ shifts us up the chain,
while the lowering operator $a$ shifts us down the chain (up to a constant).
In particular, the "ground state" $\psi_0$ is annihilated by $a$. Thus, we have a
complete understanding of how $a$ and $a^*$ act on this chain of eigenvectors
for $a^*a$.
Proof. The first result is the definition of $\psi_{n+1}$, and the second follows
from Proposition 11.1 and the fact that $a^*a\psi_0 = 0$. For the third result,
if $n \ne m$, we use the general result that eigenvectors for a self-adjoint
operator (in our case, $a^*a$) with distinct eigenvalues are orthogonal. (This
result actually applies to operators that are only symmetric.)

If $n = m$, we work by induction. For $n = 0$, $\langle \psi_0, \psi_0 \rangle = 1$ is assumed. If
we assume $\langle \psi_n, \psi_n \rangle = n!$, we compute that

\[
\begin{aligned}
\langle \psi_{n+1}, \psi_{n+1} \rangle &= \langle a^*\psi_n, a^*\psi_n \rangle \\
&= \langle \psi_n, aa^*\psi_n \rangle \\
&= \langle \psi_n, (a^*a + I)\psi_n \rangle \\
&= (n+1)\langle \psi_n, \psi_n \rangle \\
&= (n+1)!.
\end{aligned}
\]

Finally, we compute that

\[ a\psi_{n+1} = aa^*\psi_n = (a^*a + I)\psi_n = (n+1)\psi_n, \]

which establishes the last claimed result. ∎
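
The relations in Theorem 11.2 can be illustrated with finite matrices. The sketch below makes two assumptions that are not in the text: it works in the normalized basis $e_n = \psi_n/\sqrt{n!}$, in which $a e_n = \sqrt{n}\, e_{n-1}$, and it truncates to the first $N = 8$ basis vectors. Truncation necessarily breaks $[a, a^*] = I$ in the last diagonal entry, since no finite matrices can satisfy that relation exactly.

```python
import numpy as np
from math import factorial

N = 8                      # truncation size (illustrative choice)
n = np.arange(N)

# Lowering operator in the normalized basis: a e_n = sqrt(n) e_{n-1}.
a = np.diag(np.sqrt(n[1:]), k=1)
a_star = a.conj().T        # raising operator: a* e_n = sqrt(n+1) e_{n+1}

number_op = a_star @ a                     # should be diag(0, 1, ..., N-1)
commutator = a @ a_star - a_star @ a       # identity except the last entry

# Unnormalized chain psi_n = (a*)^n psi_0, starting from psi_0 = e_0.
psi = [np.eye(N)[0]]
for _ in range(N - 1):
    psi.append(a_star @ psi[-1])

# Gram matrix <psi_n, psi_m>, expected to be diag(n!).
gram = np.array([[u @ v for v in psi] for u in psi])
expected_gram = np.diag([float(factorial(k)) for k in range(N)])

# Check a psi_{k+1} = (k+1) psi_k along the chain.
lowering_ok = all(np.allclose(a @ psi[k + 1], (k + 1) * psi[k])
                  for k in range(N - 1))
```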

It is now reasonable to ask whether the vectors $\{\psi_n\}_{n=0}^{\infty}$ form an
orthogonal basis for the quantum Hilbert space. Suppose this is not the
case. If we then let $V$ denote the closed span of the $\psi_n$'s, $V$ will be invariant
under both $a$ and $a^*$. Thus, by elementary linear algebra, the orthogonal
complement $V^{\perp}$ of $V$ will also be invariant under the adjoint operators $a^*$
and $a$, and therefore also under $a^*a$. Therefore, we can begin our analysis
anew in $V^{\perp}$, with the result that we will obtain a new ground state $\phi_0 \in V^{\perp}$
(satisfying $a\phi_0 = 0$) that is orthogonal to the original ground state $\psi_0$. If,
then, the closed span of the $\psi_n$'s is not the whole Hilbert space, there will
exist at least two independent solutions of the equation $a\psi = 0$. To put this
claim the other way around, if it turns out that there is only one solution
(up to a constant) of $a\psi = 0$, then we expect that the vectors obtained by
applying $a^*$ repeatedly to the solution will form an orthogonal basis for our
Hilbert space. (Because we are glossing over various technical issues having
to do with the domains of various operators, this conclusion should not be
regarded as completely rigorous.)


11.3 The Analytic Approach 


In the preceding section, we analyzed the eigenvectors of the operator $a^*a$
as much as possible using only the commutation relation $[a, a^*] = I$, which
follows from the underlying commutation relation $[X, P] = i\hbar I$. To progress
further, we must recall the actual formula for the operators $a$ and $a^*$.
To simplify our analysis, let us introduce the following natural scale of
distance for our problem:

\[ D := \sqrt{\frac{\hbar}{m\omega}}. \]

We then introduce a normalized position variable, measured in units of $D$,

\[ \tilde{x} = \frac{x}{D}, \tag{11.9} \]

so that

\[ \frac{d}{d\tilde{x}} = \sqrt{\frac{\hbar}{m\omega}}\,\frac{d}{dx}. \]

A calculation gives the following simple expressions for the raising and
lowering operators:

\[ a = \frac{1}{\sqrt{2}}\left(\tilde{x} + \frac{d}{d\tilde{x}}\right), \qquad a^* = \frac{1}{\sqrt{2}}\left(\tilde{x} - \frac{d}{d\tilde{x}}\right). \tag{11.10} \]

Note that the constants $m$, $\omega$, and $\hbar$ have conveniently disappeared from
the formulas.

Given the expression in (11.10), we can easily solve the (first-order,
linear) equation $a\psi_0 = 0$ as

\[ \psi_0(\tilde{x}) = Ce^{-\tilde{x}^2/2}. \tag{11.11} \]

If we take $C$ to be positive, then our normalization condition determines
its value to be $(\pi D^2)^{-1/4}$, by Proposition A.22. (The normalization condition
is that the integral of $|\psi_0|^2$ with respect to $dx$, not $d\tilde{x}$, should be 1.) We
obtain, then,

\[ \psi_0(x) = \sqrt[4]{\frac{m\omega}{\pi\hbar}}\,\exp\left\{-\frac{m\omega}{2\hbar}x^2\right\}. \tag{11.12} \]

It remains only to apply $a^*$ repeatedly to $\psi_0$ to get the "excited states"
$\psi_n$.


Theorem 11.3 The ground state $\psi_0$ of the harmonic oscillator is given
by (11.12). The excited states $\psi_n$ are given by

\[ \psi_n = H_n\psi_0, \tag{11.13} \]

where $H_n$ is a polynomial of degree $n$ given inductively by the formulas

\[
\begin{aligned}
H_0(\tilde{x}) &= 1 \\
H_{n+1}(\tilde{x}) &= \frac{1}{\sqrt{2}}\left(2\tilde{x}H_n(\tilde{x}) - \frac{dH_n(\tilde{x})}{d\tilde{x}}\right).
\end{aligned}
\]

Here, $\tilde{x}$ is the normalized position variable given by (11.9).


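The polynomials $H_n$ are essentially (modulo various normalization
conventions) the Hermite polynomials.
Proof. When $n = 0$, (11.13) reduces to $\psi_0 = \psi_0$. Assuming that (11.13)
holds for some $n$, we compute $\psi_{n+1}$ as

\[
\begin{aligned}
\psi_{n+1} = a^*\psi_n &= \frac{1}{\sqrt{2}}\left(\tilde{x}H_n(\tilde{x})Ce^{-\tilde{x}^2/2} - \frac{d}{d\tilde{x}}\left[H_n(\tilde{x})Ce^{-\tilde{x}^2/2}\right]\right) \\
&= \frac{1}{\sqrt{2}}\left(2\tilde{x}H_n(\tilde{x}) - \frac{dH_n(\tilde{x})}{d\tilde{x}}\right)Ce^{-\tilde{x}^2/2} = H_{n+1}(\tilde{x})\psi_0(\tilde{x}),
\end{aligned}
\]

as claimed. ∎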
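
To make the "normalization conventions" concrete, the recursion in Theorem 11.3 can be run symbolically on polynomial coefficients. The sketch below compares the result with the standard physicists' Hermite polynomials as implemented in NumPy; the scaling factor $2^{-n/2}$ relating the two is a claim checked by the code, not a statement from the text, and the helper name `recursion_polynomials` is hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite import herm2poly

def recursion_polynomials(n_max):
    """Return [H_0, ..., H_{n_max}] from H_{n+1} = (2 x H_n - H_n') / sqrt(2)."""
    x = Polynomial([0.0, 1.0])
    H = [Polynomial([1.0])]
    for _ in range(n_max):
        H.append((2 * x * H[-1] - H[-1].deriv()) / np.sqrt(2))
    return H

def physicists_hermite_coeffs(n):
    """Power-basis coefficients (lowest degree first) of the physicists' Hermite polynomial."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return herm2poly(c)

H = recursion_polynomials(5)

# Compare H_n with 2^{-n/2} times the physicists' Hermite polynomial of degree n.
matches = [np.allclose(H[n].coef, 2.0 ** (-n / 2) * physicists_hermite_coeffs(n))
           for n in range(6)]
```

For example, `H[2].coef` works out to the coefficients of $2\tilde{x}^2 - 1$, which is one half of the physicists' polynomial $4\tilde{x}^2 - 2$.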

Figure 11.1 shows the ground state of the harmonic oscillator, along with
the excited states with $n = 5$ and $n = 30$. Each eigenfunction is plotted as
a function of the normalized position variable $\tilde{x}$. In each case, the shaded
region indicates the extent of the classically allowed region, that is, the
range in which a classical particle with energy $E_n$ can move. Note that
each wave function decays rapidly outside the classically allowed region.
In the last image, we can see that the frequency of oscillation of the wave
function is greatest in the middle of the classically allowed region, while the
amplitude of the wave function is greatest near the ends of the classically
allowed region. Intuitively, these properties of the wave function reflect that
a classical particle with energy $E_n$ has largest momentum in the middle of
the classically allowed region (where the potential is smallest) and that the
classical particle spends more time at the ends of the classically allowed
region, since it is moving slowest there. Further development of this sort of
reasoning may be found in Chap. 15.


11.4 Domain Conditions and Completeness 


Although the analysis in Sect. 11.2 is typical of what is found in physics 
texts, it is not completely rigorous from a mathematician’s point of view. 



FIGURE 11.1. Harmonic oscillator eigenvectors with n = 0, n = 5, and n = 30.
In each case, the classically allowed region is shaded.

The main problem is that the lowering operator $a$, the raising operator $a^*$,
and the product operator $a^*a$ are all unbounded operators. The difficulty
in working with unbounded operators is that one constantly has to check
that a vector is in the domain of the relevant operator before applying that
operator. For example, suppose we have a vector $\psi_0$ in the domain of $a$ and
satisfying $a\psi_0 = 0$. We wish to apply the raising operator $a^*$ to $\psi_0$ and we
then want to argue that

\[ a^*a(a^*\psi_0) = a^*\psi_0. \]

This is easy enough to verify (as we did in the previous section) provided
that all vectors are in the domain of the relevant operators. But how do
we know that $\psi_0$ is in the domain of $a^*$? And even if it is, how do we know
that $a^*\psi_0$ is in the domain of $a^*a$?

These concerns are not just theoretical. Consider a general pair of
operators $A$ and $B$ satisfying $[A, B] = i\hbar I$. If we try to analyze an
operator of the form $\alpha A^2 + \beta B^2$, for $\alpha, \beta > 0$, by the methods of Sect. 11.2,
things can easily go awry, as the counterexample in Sect. 12.2 demonstrates.
Fortunately, in the case of the ordinary position and momentum operators,
the putative eigenfunctions $\psi_n$ for $a^*a$ in Theorem 11.3 are very nice
functions, in the form of a polynomial times a Gaussian. Thus, there is no
difficulty in verifying that these functions are in the domain of any finite
product of creation and annihilation operators. It follows that if $a$ and $a^*$
are given in terms of the usual position and momentum operators and $\psi_0$ is
given by (11.12), the relations in Theorem 11.2 indeed hold.

In particular, we can see that the $\psi_n$'s form an orthogonal set of functions
in $L^2(\mathbb{R})$. Showing that they form an orthogonal basis is also not terribly
difficult.
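
The orthogonality relation $\langle \psi_n, \psi_m \rangle = n!\,\delta_{nm}$ can be confirmed numerically for the first few functions. The following sketch assumes units in which $D = 1$, so that $\psi_0(x)^2 = \pi^{-1/2}e^{-x^2}$, and uses Gauss-Hermite quadrature, which integrates polynomials against the weight $e^{-x^2}$ exactly up to the degree allowed by the number of nodes. It checks orthogonality only; completeness is a separate matter.

```python
import numpy as np
from math import factorial
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite import hermgauss

n_max = 6

# Polynomials H_n from the recursion in Theorem 11.3.
x = Polynomial([0.0, 1.0])
H = [Polynomial([1.0])]
for _ in range(n_max):
    H.append((2 * x * H[-1] - H[-1].deriv()) / np.sqrt(2))

# Gauss-Hermite rule with 40 nodes: exact for polynomial integrands of
# degree <= 79 against the weight exp(-x^2).
nodes, weights = hermgauss(40)

# <psi_n, psi_m> = pi^{-1/2} * integral of H_n H_m exp(-x^2) dx  (with D = 1).
values = np.array([h(nodes) for h in H])          # shape (n_max + 1, 40)
gram = (values * weights) @ values.T / np.sqrt(np.pi)

expected = np.diag([float(factorial(n)) for n in range(n_max + 1)])
```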


Theorem 11.4 The functions

\[ \psi_n(x) = H_n\!\left(\frac{x}{D}\right)\psi_0(x), \quad n = 0, 1, 2, \ldots, \]

form an orthogonal basis for the Hilbert space $L^2(\mathbb{R})$.








The following result is the key to the proof. 


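Lemma 11.5 For all $\alpha \in \mathbb{C}$, the partial sums of the series

\[ \sum_{n=0}^{\infty} \frac{\alpha^n\tilde{x}^n}{n!}\,e^{-\tilde{x}^2/2} \]

converge in $L^2(\mathbb{R})$ to the function $e^{\alpha\tilde{x}}e^{-\tilde{x}^2/2}$.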
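Proof. We need to show that

\[ \left\| e^{\alpha\tilde{x}}e^{-\tilde{x}^2/2} - \sum_{n=0}^{N} \frac{\alpha^n\tilde{x}^n}{n!}e^{-\tilde{x}^2/2} \right\|^2 = \int_{\mathbb{R}} \left| \sum_{n=N+1}^{\infty} \frac{\alpha^n\tilde{x}^n}{n!} \right|^2 e^{-\tilde{x}^2}\,d\tilde{x} \tag{11.14} \]

tends to zero as $N$ tends to infinity. The integrand on the right-hand side
of (11.14) tends to zero pointwise. If we can find a suitable dominating
function, we can use dominated convergence to conclude that the integral
also tends to zero. We see that

\[ \left| \sum_{n=N+1}^{\infty} \frac{\alpha^n\tilde{x}^n}{n!} \right|^2 e^{-\tilde{x}^2} \le \left( \sum_{n=0}^{\infty} \frac{|\alpha\tilde{x}|^n}{n!} \right)^2 e^{-\tilde{x}^2} = e^{2|\alpha||\tilde{x}|}e^{-\tilde{x}^2}. \]

Since this last function certainly has finite integral, dominated convergence
applies and we are done. ∎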

Proof of Theorem 11.4. It is easily seen that the raising and
lowering operators map the Schwartz space $\mathcal{S}(\mathbb{R})$ (Definition A.15) into itself.
Furthermore, it is easy to verify (Exercise 1) that

\[ \left\langle \frac{d\phi}{dx}, \psi \right\rangle = -\left\langle \phi, \frac{d\psi}{dx} \right\rangle, \]

for all $\phi, \psi \in \mathcal{S}(\mathbb{R})$. From this, we can easily verify that for all $\phi, \psi \in \mathcal{S}(\mathbb{R})$,

\[ \langle \phi, a\psi \rangle = \langle a^*\phi, \psi \rangle \]

and so also

\[ \langle \phi, a^*a\psi \rangle = \langle a^*a\phi, \psi \rangle. \]

It is evident that both the ground state $\psi_0$ and all the excited states $\psi_n$
occurring in Theorem 11.4 belong to $\mathcal{S}(\mathbb{R})$. Thus, the proof of Theorem 11.2
is indeed valid. We conclude, then, that the $\psi_n$'s form an orthogonal set of
vectors in $L^2(\mathbb{R})$ and that they are eigenvectors for $\hat{H}$ with the indicated
eigenvalues.

It remains to show that the $\psi_n$'s form an orthogonal basis for $L^2(\mathbb{R})$. Let
$V$ denote the space of finite linear combinations of the $\psi_n$'s. Since $H_n$ is a
polynomial of degree $n$, it is easily seen that $V$ consists precisely of functions
of the form

\[ \psi(\tilde{x}) = p(\tilde{x})e^{-\tilde{x}^2/2}, \]

where $p$ is a polynomial.

Lemma 11.5 then shows that $e^{ik\tilde{x}}e^{-\tilde{x}^2/2}$ belongs to the $L^2$-closure of $V$
for all $k \in \mathbb{R}$. Thus, if $\psi$ is orthogonal to every element of $V$, we have

\[ \int_{\mathbb{R}} e^{ik\tilde{x}}e^{-\tilde{x}^2/2}\psi(\tilde{x})\,d\tilde{x} = 0 \tag{11.15} \]

for all $k$. Now, since $e^{-\tilde{x}^2/2}$ belongs to $L^\infty(\mathbb{R}) \cap L^2(\mathbb{R})$ and $\psi$ belongs to
$L^2(\mathbb{R})$, their product belongs to $L^2(\mathbb{R}) \cap L^1(\mathbb{R})$. Thus, (11.15) tells us that
the $L^2$ Fourier transform of $e^{-\tilde{x}^2/2}\psi(\tilde{x})$ is identically zero. Thus, $e^{-\tilde{x}^2/2}\psi(\tilde{x})$
must be the zero element of $L^2(\mathbb{R})$, by the Plancherel theorem, and so
$\psi(\tilde{x}) = 0$ almost everywhere. This shows that $V^{\perp} = \{0\}$, meaning that $V$
is dense in $L^2(\mathbb{R})$. ∎


11.5 Exercises 


1. Show that for any Schwartz functions $\phi$ and $\psi$, we have

\[ \langle \phi, a\psi \rangle = \langle a^*\phi, \psi \rangle, \]

as expected.


Hint: Use integration by parts on the interval [—A, A] and show that 
the boundary terms tend to zero as A tends to infinity. 


2. Show that the polynomials $H_n$ satisfy the following relations:

\[ \frac{dH_n}{dy}(y) = n\sqrt{2}\,H_{n-1}(y) \]

and

\[ H_{n+1}(y) = \frac{1}{\sqrt{2}}\left(2yH_n(y) - n\sqrt{2}\,H_{n-1}(y)\right). \]

Hint: Start with the relation $a\psi_n = n\psi_{n-1}$.


3. Establish the following Rodrigues formula for the polynomials $H_n$:

\[ H_n(y) = \frac{(-1)^n}{2^{n/2}}\,e^{y^2}\frac{d^n}{dy^n}e^{-y^2}. \]


4. In this exercise, we prove the following claim: The polynomial $H_n$ has
$n$ distinct real zeros and the zeros of $H_n$ "interlace" with the zeros of
$H_{n-1}$, meaning that there is exactly one zero of $H_{n-1}$ between each
pair of consecutive zeros of $H_n$.

(a) Verify the claim for $H_1$ and $H_2$.

(b) Assume, inductively, that $H_n$ and $H_{n-1}$ have distinct real zeros
and that the zeros interlace. Show that $H_{n-1}$ alternates in sign
at consecutive zeros of $H_n$. Then show that $H_{n+1}$ and $H_{n-1}$ have
opposite signs at each zero of $H_n$, so that $H_{n+1}$ also alternates
in sign at consecutive zeros of $H_n$. Conclude that $H_{n+1}$ must
have at least one zero between each pair of consecutive zeros
of $H_n$.

Hint: Use Exercise 2.

(c) Show that $H_{n+1}$ and $H_{n-1}$ have the same sign near $\pm\infty$ but
opposite signs at the largest and smallest zeros of $H_n$. Conclude
that $H_{n+1}$ has at least one zero below the smallest zero of $H_n$
and at least one zero above the largest zero of $H_n$.

(d) Conclude that $H_{n+1}$ has $n + 1$ real zeros that interlace with the
zeros of $H_n$.


5. Let $\phi_n = \psi_n/\|\psi_n\|$ be the normalized $n$th excited state.

(a) Let $\tilde{X} = X/D$, where $D = (\hbar/m\omega)^{1/2}$. Show that

\[ \langle \tilde{X}^2 \rangle_{\phi_n} = n + \frac{1}{2}. \]

Hint: Express $\tilde{X}$ in terms of $a$ and $a^*$, using (11.10), and then
use Theorem 11.2.

(b) Show that


ieee (hota) 


(c) If T and V denote the kinetic energy and potential energy terms, 
respectively, in (11.3), show that 


12
The Uncertainty Principle 


In this chapter, we will continue our investigation of the consequences of 
the commutation relations among the position and momentum operators. 
We will mostly consider a particle in R!, where we have 


[X, P] = ihl. (12.1) 


We have already seen that much of the analysis of the Hamiltonian H 
for the quantum harmonic oscillator (given by c;P? + coX) can be car- 
ried out using only the commutation relation (12.1). There are two other 
main results that can be derived from these commutation relations: the 
Heisenberg uncertainty principle and the Stone-von Neumann theorem. 
The uncertainty principle states that the product of the uncertainty in X 
and the uncertainty in P cannot be smaller than h/2. The Stone-von Neu- 
mann theorem, meanwhile, states that any two self-adjoint operators A 
and B satisfying [A, B] = ihI “look like” several copies of the standard 
position and momentum operators acting on L?(R). Both results are true 
only under certain technical domain conditions, which we will need to ex- 
amine carefully. We discuss the uncertainty principle in this chapter and 
the Stone-von Neumann theorem in the next chapter. 

The uncertainty principle states that for all $\psi$ in $L^2(\mathbb{R})$ satisfying certain domain conditions, we have

\[ (\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2}, \]

where, for any observable $A$, we let $\Delta_\psi A$ denote the “uncertainty” in measurements of $A$ in the state $\psi$ (Definition 3.13). This means that one cannot make both the uncertainty in position and the uncertainty in momentum arbitrarily small in the same state $\psi$.

Although we can easily make $\Delta_\psi X$ as small as we want simply by taking $\psi$ to be supported in a small interval, if we do that, $\Delta_\psi P$ will be large. Similarly, we can make $\Delta_\psi P$ as small as we like, by taking the momentum wave function $\tilde{\psi}(p)$ (Sect. 6.6) to be supported in a small interval, but then $\Delta_\psi X$ will get large. In the idealized limit in which the position wave function is concentrated at a single point, $\psi(x)$ would be a multiple of $\delta(x - a)$ for some $a$, in which case the momentum wave function $\tilde{\psi}(p)$ would be a multiple of $e^{-iap/\hbar}$. In that case, $|\tilde{\psi}(p)|^2$ is constant, meaning that the momentum wave function is completely spread out over the whole real line.

This uncertainty principle may be interpreted as saying that it is impossible to simultaneously measure the position and momentum of a quantum particle. After all, we have said (Axiom 4) that if we perform a measurement of an observable $A$ with a discrete spectrum, then immediately after the measurement the state $\psi$ of the system should be an eigenvector for $A$. If $A$ has a continuous spectrum, this principle is replaced by the requirement that after the measurement, the uncertainty in $A$ should be very small. If we could measure both the position and the momentum of the particle simultaneously with arbitrary precision, then after the measurement, both $\Delta X$ and $\Delta P$ would have to be very small, violating the uncertainty principle.

Now, on the scale of everyday life, Planck’s constant is very small. If, for example, we measure mass in units of grams, distance in units of centimeters, and time in units of seconds, then $\hbar$ has the numerical value of $1.054 \times 10^{-27}$. Thus, on “macroscopic” scales of energy and momentum, it is possible for the uncertainties in position and momentum both to be very small. But on the atomic scale, the uncertainty principle puts a substantial limitation on how localized the position and momentum of a particle can be.
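To get a feel for these scales, the following minimal sketch evaluates the bound $\Delta P \geq \hbar/(2\Delta X)$ in the gram-centimeter-second units used above. The two scenarios (a 1-gram bead localized to a micron, an electron localized to roughly an atomic diameter), the electron mass value, and the nonrelativistic conversion $\Delta v = \Delta P/m$ are illustrative assumptions, not taken from the text.

```python
# Planck's constant (hbar) in CGS units, erg * s, as quoted in the text.
HBAR_CGS = 1.054e-27

# Electron mass in grams (standard value, an assumption of this sketch).
ELECTRON_MASS_G = 9.109e-28


def min_velocity_spread(mass_g, delta_x_cm):
    """Smallest velocity uncertainty (cm/s) compatible with
    (Delta X)(Delta P) >= hbar / 2, using Delta P = mass * Delta v."""
    delta_p = HBAR_CGS / (2.0 * delta_x_cm)  # g * cm / s
    return delta_p / mass_g


# A 1-gram bead localized to within one micron (1e-4 cm).
bead = min_velocity_spread(1.0, 1e-4)

# An electron localized to within about 1e-8 cm.
electron = min_velocity_spread(ELECTRON_MASS_G, 1e-8)

print(f"bead:     {bead:.3e} cm/s")
print(f"electron: {electron:.3e} cm/s")
```

For the bead the minimum velocity spread is far below anything measurable, while for the electron it is a sizable fraction of typical atomic velocities, which is the contrast the paragraph above describes.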

In Sect. 12.1, we prove a version of the uncertainty principle for any two operators $A$ and $B$ satisfying $[A, B] = i\hbar I$, under a seemingly innocuous assumption on the domains of the operators involved. In Sect. 12.2, however, we see that the domain assumptions are not so innocuous after all. In that section, we encounter two operators satisfying $[A, B] = i\hbar I$ on a dense subspace of the Hilbert space, along with a vector $\psi$ such that the uncertainty in $A$ is finite and the uncertainty in $B$ is zero. The existence of such a vector is surely contrary to the spirit of the uncertainty principle, even though it does not violate the version of the uncertainty principle proved in Sect. 12.1. (The vector $\psi$ in Sect. 12.2 does not satisfy the domain assumptions of Theorem 12.4.) Finally, in Sect. 12.3, we show that for the usual position and momentum operators on $L^2(\mathbb{R})$, no such counterexamples occur: If $\Delta_\psi X$ and $\Delta_\psi P$ are both defined, then $(\Delta_\psi X)(\Delta_\psi P) \geq \hbar/2$.


12.1 Uncertainty Principle, First Version


In this section, it is essential that we make sure that all vectors are in the domains of the various operators we want to apply to these vectors. With this concern in mind, we make the following definition. (Compare Definition 9.36.)


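Definition 12.1 If $A$ and $B$ are unbounded operators on $\mathbf{H}$, define $AB$ to be the operator with domain

\[ \mathrm{Dom}(AB) = \{ \psi \in \mathrm{Dom}(B) \mid B\psi \in \mathrm{Dom}(A) \} \]

and given by $(AB)\psi = A(B\psi)$.


Even if $\mathrm{Dom}(A)$ and $\mathrm{Dom}(B)$ are dense in $\mathbf{H}$, it could happen that $\mathrm{Dom}(AB)$ is not dense in $\mathbf{H}$.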

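Recall (Definition 3.13) that the uncertainty of a symmetric operator $A$ in a state $\psi$ is defined to be

\[ (\Delta_\psi A)^2 = \left\langle (A - \langle A \rangle_\psi I)^2 \right\rangle_\psi. \tag{12.2} \]

As written, this definition requires that $\psi$ belong to the domain of $(A - \langle A \rangle_\psi I)^2$, which is the same as the domain of $A^2$. However, since we assume that $A$ is symmetric, $\langle A \rangle_\psi = \langle \psi, A\psi \rangle$ is real, so that $A - \langle A \rangle_\psi I$ is again symmetric. Thus, (12.2) can be rewritten as

\[ (\Delta_\psi A)^2 = \left\langle (A - \langle A \rangle_\psi I)\psi, (A - \langle A \rangle_\psi I)\psi \right\rangle. \]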


Having written the uncertainty in this way, it is natural to extend the 
definition of uncertainty to vectors that belong only to Dom(A), as follows. 


Definition 12.2 If $A$ is a symmetric operator on $\mathbf{H}$, then for all unit vectors $\psi$ in $\mathrm{Dom}(A)$, the uncertainty $\Delta_\psi A$ of $A$ in the state $\psi$ is given by

\[ (\Delta_\psi A)^2 = \left\langle (A - \langle A \rangle_\psi I)\psi, (A - \langle A \rangle_\psi I)\psi \right\rangle. \tag{12.3} \]


By expanding out the right-hand side of (12.3), we see that the uncertainty may also be computed as

\[ (\Delta_\psi A)^2 = \langle A\psi, A\psi \rangle - (\langle \psi, A\psi \rangle)^2. \]

[Compare (3.24).] Of course, if $\psi$ happens to be in the domain of $A^2$, then Definition 12.2 agrees with (12.2).


Proposition 12.3 If $A$ is a symmetric operator on $\mathbf{H}$, then for all unit vectors $\psi \in \mathrm{Dom}(A)$, we have $\Delta_\psi A = 0$ if and only if $\psi$ is an eigenvector for $A$.

Proof. If $\Delta_\psi A = 0$, then from (12.3), we see that $(A - \langle A \rangle_\psi I)\psi = 0$, meaning that $\psi$ is an eigenvector for $A$ with eigenvalue $\langle A \rangle_\psi$. Conversely, if $A\psi = \lambda\psi$ for some $\lambda$, then $\langle \psi, A\psi \rangle = \lambda \langle \psi, \psi \rangle = \lambda$. Thus, $(A - \langle A \rangle_\psi I)\psi = 0$, which, by (12.3), means that $\Delta_\psi A = 0$. ∎
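In finite dimensions, where every Hermitian matrix is a symmetric operator with no domain issues, Definition 12.2 and Proposition 12.3 can be checked directly. The sketch below uses the expanded formula $(\Delta_\psi A)^2 = \langle A\psi, A\psi\rangle - \langle \psi, A\psi\rangle^2$; the $2 \times 2$ matrix and the two test vectors are illustrative choices, not examples from the text.

```python
import math


def inner(u, v):
    """<u, v>, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def uncertainty(M, psi):
    """Delta_psi M for a Hermitian matrix M and a unit vector psi."""
    Mpsi = apply(M, psi)
    mean = inner(psi, Mpsi).real
    var = inner(Mpsi, Mpsi).real - mean ** 2
    return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding errors


A = [[2, 1], [1, 2]]              # symmetric, with eigenvalues 1 and 3
s = 1 / math.sqrt(2)
eigvec = [s, s]                   # A eigvec = 3 * eigvec
other = [1.0, 0.0]                # not an eigenvector of A

print(uncertainty(A, eigvec))     # zero up to rounding
print(uncertainty(A, other))      # 1.0
```

The eigenvector has vanishing uncertainty, while the vector `other`, for which $\langle A\psi, A\psi\rangle = 5$ and $\langle\psi, A\psi\rangle = 2$, has uncertainty 1.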

As discussed in the introduction to this chapter, we expect that imme- 
diately after a measurement of an observable A, the state of the system 
will have very small uncertainty for A. Indeed, if A has discrete spectrum, 
we expect that the state of the system will be an eigenvector for A. Even 
in the case of a continuous spectrum, we expect that the uncertainty in 
A can be made as small as one wishes, by making more and more precise 
measurements. Suppose now that one wishes to observe simultaneously two 
(or more) different observables, represented by operators A and B. In the 
case of a discrete spectrum, the system after the measurement should be 
simultaneously an eigenvector for A and an eigenvector for B. In the case 
where A and B commute, this idea is reasonable. There is a version of 
the spectral theorem for commuting self-adjoint operators; in the case of 
discrete spectrum, it says that two commuting self-adjoint operators have 
an orthonormal basis of simultaneous eigenvectors with real eigenvalues. 
(In the case of unbounded operators, there are, as usual, technical domain 
conditions in defining what it means for two self-adjoint operators to com- 
mute.) 

In the case where A and B do not commute, they do not need to have any 
simultaneous eigenvectors. Certainly, A and B cannot have an orthonormal 
basis of simultaneous eigenvectors, or they would in fact commute. The lack 
of simultaneous eigenvectors suggests, then, that it is simply not possible 
to make a simultaneous measurement of two self-adjoint operators unless 
they commute. In standard physics terminology, the quantities A and B 
are said to be “incommensurable,” meaning not capable of being measured 
at the same time. (See Exercise 2 for a classification of the simultaneous 
eigenvectors of a representative pair of noncommuting operators.) 

In the case of a continuous spectrum, the notion of an eigenvector is 
replaced by the notion of a state with very small uncertainty for the relevant 
operator. In light of our discussion of simultaneous eigenvectors, we may 
expect that for noncommuting operators, it may be difficult to find states 
where the uncertainties of both operators are small. This expectation is 
realized in the following version of the uncertainty principle. 


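Theorem 12.4 Suppose $A$ and $B$ are symmetric operators and $\psi$ is a unit vector belonging to $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Then

\[ (\Delta_\psi A)^2 (\Delta_\psi B)^2 \geq \frac{1}{4} \left| \langle [A, B] \rangle_\psi \right|^2. \tag{12.4} \]

Note that if $\psi \in \mathrm{Dom}(AB)$ then in particular, $\psi \in \mathrm{Dom}(B)$, and if $\psi \in \mathrm{Dom}(BA)$ then $\psi \in \mathrm{Dom}(A)$. Thus, the assumptions on $\psi$ are sufficient to guarantee that $\Delta_\psi A$ and $\Delta_\psi B$ make sense as in Definition 12.2.

Proof. Define operators $A'$ and $B'$ by $A' := A - \langle \psi, A\psi \rangle I$ and $B' := B - \langle \psi, B\psi \rangle I$. (We use the same domains for $A'$ and $B'$ as for $A$ and $B$, and it is easily verified that $A'$ and $B'$ are still symmetric on those domains.) Then by the Cauchy-Schwarz inequality, we obtain

\[
\begin{aligned}
\langle A'\psi, A'\psi \rangle \langle B'\psi, B'\psi \rangle
  &\geq \left| \langle A'\psi, B'\psi \rangle \right|^2 && (12.5) \\
  &\geq \left| \mathrm{Im}\, \langle A'\psi, B'\psi \rangle \right|^2 && (12.6) \\
  &= \frac{1}{4} \left| \langle A'\psi, B'\psi \rangle - \langle B'\psi, A'\psi \rangle \right|^2. && (12.7)
\end{aligned}
\]

The assumptions on $\psi$ guarantee that $B\psi \in \mathrm{Dom}(A)$ and hence also that $B'\psi \in \mathrm{Dom}(A')$, and similarly with $A'$ and $B'$ reversed. Since $A'$ and $B'$ are symmetric, we may rewrite (12.7) as

\[
\langle A'\psi, A'\psi \rangle \langle B'\psi, B'\psi \rangle
  \geq \frac{1}{4} \left| \langle \psi, A'B'\psi \rangle - \langle \psi, B'A'\psi \rangle \right|^2
  = \frac{1}{4} \left| \langle \psi, [A', B']\psi \rangle \right|^2.
\]

Now, since the identity operator commutes with everything, the commutator of $A'$ and $B'$ is the same as the commutator of $A$ and $B$. Furthermore, $\langle A'\psi, A'\psi \rangle$ is nothing but $(\Delta_\psi A)^2$ and similarly for $B$. Thus, we obtain

\[ (\Delta_\psi A)^2 (\Delta_\psi B)^2 \geq \frac{1}{4} \left| \langle \psi, [A, B]\psi \rangle \right|^2, \]

which is what we wanted to prove. ∎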
We now specialize Theorem 12.4 to the case in which the commutator is 
ihI and take the square root of both sides. 


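Corollary 12.5 Suppose $A$ and $B$ are symmetric operators satisfying

\[ [A, B] = i\hbar I \]

on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Then if $\psi \in \mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is a unit vector, we have

\[ (\Delta_\psi A)(\Delta_\psi B) \geq \frac{\hbar}{2}. \tag{12.8} \]

In particular, for all unit vectors $\psi \in L^2(\mathbb{R})$ in $\mathrm{Dom}(XP) \cap \mathrm{Dom}(PX)$, we have

\[ (\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2}. \tag{12.9} \]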


Note that the factor of $\hbar$ appearing on the right-hand side of (12.8) is really just $|\langle \psi, [A, B]\psi \rangle|$. Since, however, $\psi$ is a unit vector and $[A, B] = i\hbar I$, $\psi$ drops out of the right-hand side of our inequality. We see then that both sides of (12.9) make sense whenever $\Delta_\psi X$ and $\Delta_\psi P$ make sense, namely, whenever $\psi$ belongs to $\mathrm{Dom}(X)$ and to $\mathrm{Dom}(P)$. (Recall Definition 12.2.) On the other hand, the proof that we have given for (12.9) requires $\psi$ to be in both $\mathrm{Dom}(XP)$ and $\mathrm{Dom}(PX)$. Nevertheless, it is natural to ask whether (12.9) holds for all $\psi$ in $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$. We may similarly ask whether (12.8) holds for all $\psi$ in $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$. As we will see in Sects. 12.2 and 12.3, the answer to the first question is yes and the answer to the second question is no.

Meanwhile, it is of interest to investigate “minimum uncertainty states,” that is, states $\psi$ for which the inequality (12.4) is an equality.


Proposition 12.6 If $A$ and $B$ are symmetric and $\psi$ is a unit vector in $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$, equality holds in (12.4) if and only if one of the following holds: (1) $\psi$ is an eigenvector for $A$, (2) $\psi$ is an eigenvector for $B$, or (3) $\psi$ is an eigenvector for an operator of the form

\[ A - i\gamma B \]

for some nonzero real number $\gamma$.


In the case $A = X$ and $B = P$, we will consider examples where equality holds in Sect. 12.4.
Proof. To get equality in (12.4), we must have equality in both (12.5) and (12.6). Equality in (12.5) occurs if and only if $A'\psi = 0$ or $B'\psi = 0$ or $A'\psi = cB'\psi$ for some nonzero constant $c$. If $A'\psi$ is zero, $\psi$ is an eigenvector for $A$ with eigenvalue $\langle A \rangle_\psi$. In that case, equality holds in (12.6) as well. Conversely, if $\psi$ is an eigenvector for $A$ with some eigenvalue $\lambda$, then $\langle A \rangle_\psi = \lambda$ and $A'\psi = 0$. Similarly, $B'\psi = 0$ if and only if $\psi$ is an eigenvector for $B$.

Meanwhile, suppose $A'\psi$ and $B'\psi$ are nonzero and $A'\psi = cB'\psi$, so that equality holds in (12.5). Then equality holds in (12.6) if and only if $c = i\gamma$ for some nonzero $\gamma \in \mathbb{R}$. Thus, when $A'\psi$ and $B'\psi$ are nonzero, we get equality in (12.4) if and only if

\[ A'\psi = i\gamma B'\psi \tag{12.10} \]

for some nonzero real number $\gamma$. Recalling the definition of $A'$ and $B'$, (12.10) says that

\[ (A - \langle \psi, A\psi \rangle I)\psi = i\gamma (B - \langle \psi, B\psi \rangle I)\psi \tag{12.11} \]

or

\[ (A - i\gamma B)\psi = \lambda\psi, \tag{12.12} \]

where $\lambda = \langle \psi, A\psi \rangle - i\gamma \langle \psi, B\psi \rangle$.

Thus, if (12.11) holds, $\psi$ is an eigenvector of $A - i\gamma B$. Conversely, if $\psi$ is an eigenvector for $A - i\gamma B$ with some eigenvalue $\lambda = c + id$ in $\mathbb{C}$, then

\[ (c + id) \|\psi\|^2 = \langle \psi, (A - i\gamma B)\psi \rangle = \langle \psi, A\psi \rangle - i\gamma \langle \psi, B\psi \rangle. \tag{12.13} \]

Since $A$ and $B$ are assumed to be symmetric and $\psi$ is a unit vector, we may equate real and imaginary parts in (12.13) to obtain

\[ c = \langle \psi, A\psi \rangle, \qquad d = -\gamma \langle \psi, B\psi \rangle. \]

From this we can see that (12.11) and (12.10) hold, and thus equality holds in (12.4). ∎
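For bounded operators on a finite-dimensional space there are no domain conditions, so Theorem 12.4 and Proposition 12.6 can be tested numerically. The following sketch takes $A$ and $B$ to be the Pauli matrices $\sigma_x$ and $\sigma_y$ (an illustrative choice, not an example from the text) and compares the two sides of (12.4) for two unit vectors. The vector `up` satisfies $(\sigma_x + i\sigma_y)\psi = 0$, so it is an eigenvector of $A - i\gamma B$ with $\gamma = -1$ and should give equality; the vector `tilted` is a generic state, for which strict inequality is expected.

```python
import cmath
import math


def inner(u, v):
    """<u, v>, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def variance(M, psi):
    """(Delta_psi M)^2 = <M psi, M psi> - <psi, M psi>^2 for Hermitian M."""
    Mpsi = apply(M, psi)
    return inner(Mpsi, Mpsi).real - inner(psi, Mpsi).real ** 2


def commutator_expect(A, B, psi):
    """<psi, [A, B] psi>, computed as <psi, A B psi> - <psi, B A psi>."""
    return (inner(psi, apply(A, apply(B, psi)))
            - inner(psi, apply(B, apply(A, psi))))


SX = [[0, 1], [1, 0]]        # Pauli matrix sigma_x
SY = [[0, -1j], [1j, 0]]     # Pauli matrix sigma_y


def both_sides(psi):
    """Left- and right-hand sides of (12.4) for A = sigma_x, B = sigma_y."""
    lhs = variance(SX, psi) * variance(SY, psi)
    rhs = 0.25 * abs(commutator_expect(SX, SY, psi)) ** 2
    return lhs, rhs


up = [1, 0]
theta, phi = math.pi / 3, math.pi / 4
tilted = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

print(both_sides(up))        # the two sides agree
print(both_sides(tilted))    # left side strictly larger
```

For `tilted` the left side is $(1 - \tfrac{3}{8})^2 \approx 0.39$ while the right side is $0.25$, consistent with the fact that this vector is not an eigenvector of $\sigma_x$, $\sigma_y$, or any $\sigma_x - i\gamma\sigma_y$ with real $\gamma$.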


12.2. A Counterexample 


In this section, we consider the Hilbert space $L^2[-1, 1]$. As our “position” operator, we use the usual formula,

\[ (A\psi)(x) = x\psi(x). \]

Note that $A$ is a bounded operator, because we restrict $x$ to the bounded interval $[-1, 1]$. As such, $A$ is defined (and self-adjoint) on the whole Hilbert space $L^2[-1, 1]$. As our “momentum” operator, we again use the usual formula,

\[ B = -i\hbar \frac{d}{dx}. \]

As the domain of $B$ we will take the space of continuously differentiable functions $\psi$ on $[-1, 1]$ satisfying the periodic boundary condition,

\[ \psi(-1) = \psi(1). \tag{12.14} \]


To verify that $B$ is symmetric, note that for any $C^1$ functions $\phi$ and $\psi$, we have

\[ \int_{-1}^{1} \overline{\phi(x)}\, \frac{d\psi}{dx}\, dx = \overline{\phi(x)}\, \psi(x) \Big|_{-1}^{1} - \int_{-1}^{1} \overline{\frac{d\phi}{dx}}\, \psi(x)\, dx. \]

If both $\phi$ and $\psi$ satisfy the periodic boundary condition (12.14), the boundary terms cancel out to zero. This shows that the operator $d/dx$ is skew-symmetric on $\mathrm{Dom}(B)$, from which it follows that $-i\hbar\, d/dx$ is symmetric on $\mathrm{Dom}(B)$. Actually, since the functions

\[ \psi_n(x) := \frac{1}{\sqrt{2}} e^{i\pi n x}, \quad n \in \mathbb{Z}, \tag{12.15} \]

constitute an orthonormal basis of eigenvectors for $B$ with real eigenvalues, $B$ is essentially self-adjoint, by Example 9.25.
Now, for all $\psi \in \mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ we have, by direct calculation,

\[ AB\psi - BA\psi = i\hbar\psi, \tag{12.16} \]

just as for the usual position and momentum operators. Furthermore, $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is dense in $\mathbf{H}$, since it contains all continuously differentiable functions $\psi$ such that $\psi(-1) = \psi(1) = 0$. Consider, now, the function $\psi_n(x)$ in (12.15), for some integer $n$. Clearly, $\psi_n$ is in the domain of $B$, and $B\psi_n$ is just a multiple of $\psi_n$. Since $\psi_n$ is an eigenvector for $B$, the uncertainty of $B$ in the state $\psi_n$ is zero! Meanwhile, since $A$ is bounded, the uncertainty of $A$ is well defined and finite. Thus, $\Delta_{\psi_n} A$ and $\Delta_{\psi_n} B$ are both unambiguously defined and

\[ (\Delta_{\psi_n} A)(\Delta_{\psi_n} B) = 0. \tag{12.17} \]


How can (12.17) hold? Is it not, in light of (12.16), a violation of (12.8) in Corollary 12.5? The answer is no, for the reason that $\psi_n$ does not satisfy the domain assumptions in that corollary. Specifically, $A\psi_n$ is not in the domain of $B$, since $A\psi_n$ does not satisfy the periodic boundary condition in the definition of $\mathrm{Dom}(B)$. Thus, $\psi_n$ does not belong to $\mathrm{Dom}(BA)$.

Although it does not contradict Corollary 12.5, (12.17) certainly violates 
the spirit of the uncertainty principle. In the next section, we will show 
that no such strange counterexamples occur for the usual position and 
momentum operators. 
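The three facts used in this counterexample can be checked numerically for a particular $n$. The sketch below works in units with $\hbar = 1$ (an assumption made for illustration) and uses a simple midpoint rule; the helper names are hypothetical. It computes $\Delta_{\psi_n} A$, confirms by a finite difference that $\psi_n$ satisfies the eigenvalue equation $B\psi_n = \pi n \hbar\, \psi_n$, and measures by how much $A\psi_n$ fails the periodic boundary condition (12.14).

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption of this sketch)


def psi_n(n, x):
    """The eigenfunction (12.15) of B on [-1, 1]."""
    return cmath.exp(1j * math.pi * n * x) / math.sqrt(2)


def integrate(f, a=-1.0, b=1.0, steps=20000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h


def delta_A(n):
    """Uncertainty of A (multiplication by x) in the state psi_n."""
    mean = integrate(lambda x: x * abs(psi_n(n, x)) ** 2)
    mean_sq = integrate(lambda x: x * x * abs(psi_n(n, x)) ** 2)
    return math.sqrt(mean_sq - mean ** 2)


def eigen_residual(n, x, eps=1e-6):
    """|(-i hbar psi_n')(x) - pi n hbar psi_n(x)| via a central difference."""
    deriv = (psi_n(n, x + eps) - psi_n(n, x - eps)) / (2 * eps)
    return abs(-1j * HBAR * deriv - math.pi * n * HBAR * psi_n(n, x))


def boundary_mismatch(n):
    """|(A psi_n)(1) - (A psi_n)(-1)|; nonzero means A psi_n violates (12.14)."""
    return abs(1.0 * psi_n(n, 1.0) - (-1.0) * psi_n(n, -1.0))


n = 3
print(delta_A(n))              # about 0.577, i.e. 1/sqrt(3)
print(eigen_residual(n, 0.2))  # tiny: psi_n is an eigenvector, so Delta B = 0
print(boundary_mismatch(n))    # about 1.414, so A psi_n is not in Dom(B)
```

Since $|\psi_n(x)|^2 = 1/2$ for every $n$, the value $\Delta_{\psi_n} A = 1/\sqrt{3}$ does not depend on $n$; the product of uncertainties is zero only because $\psi_n \notin \mathrm{Dom}(BA)$ places it outside the scope of Corollary 12.5.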


12.3 Uncertainty Principle, Second Version 


In this section, we will see that if $A$ and $B$ are taken to be the usual position and momentum operators $X$ and $P$, the uncertainty principle holds whenever $\Delta_\psi X$ and $\Delta_\psi P$ are defined. We continue to use Definition 12.2 for the definition of the uncertainty in any operator, in which case, for $\Delta_\psi X$ and $\Delta_\psi P$ to be defined, we require only that $\psi$ belong to $\mathrm{Dom}(X)$ and $\mathrm{Dom}(P)$.

We are now ready to formulate the strong version of the uncertainty 
principle. 


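Theorem 12.7 Suppose $\psi$ is a unit vector in $L^2(\mathbb{R})$ belonging to $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$. Then

\[ (\Delta_\psi X)(\Delta_\psi P) \geq \frac{\hbar}{2}, \tag{12.18} \]

where $\Delta_\psi X$ and $\Delta_\psi P$ are given by Definition 12.2.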


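Proof. According to Stone’s theorem and Example 10.16, the operator $P$ is $\hbar$ times the infinitesimal generator of the group $U(\cdot)$ of translations. That is to say, for all $\psi \in \mathrm{Dom}(P)$, we have

\[ (P\psi)(x) = -i\hbar \lim_{a \to 0} \frac{\psi(x + a) - \psi(x)}{a}, \]

where the limit is in the $L^2$ norm sense. Thus,

\[
\begin{aligned}
\langle X\psi, P\psi \rangle
  &= \lim_{a \to 0} \left\langle X\psi,\ -i\hbar\, \frac{\psi(x + a) - \psi(x)}{a} \right\rangle \\
  &= \lim_{a \to 0} \left( \frac{1}{a} \left\langle x\psi(x),\ -i\hbar\, \psi(x + a) \right\rangle + \frac{i\hbar}{a} \langle X\psi, \psi \rangle \right) \\
  &= \lim_{a \to 0} \left( \frac{1}{a} \left\langle (y - a)\psi(y - a),\ -i\hbar\, \psi(y) \right\rangle + \frac{i\hbar}{a} \langle X\psi, \psi \rangle \right),
\end{aligned}
\]

where in the last step we have made the change of variable $y = x + a$. If we rename the variable of integration back to $x$, we get

\[
\begin{aligned}
\langle X\psi, P\psi \rangle
  &= \lim_{a \to 0} \left( \left\langle i\hbar X\, \frac{\psi(x - a) - \psi(x)}{a},\ \psi(x) \right\rangle + i\hbar \left\langle \psi(x - a), \psi(x) \right\rangle \right) \\
  &= \lim_{a \to 0} \left( \left\langle i\hbar\, \frac{\psi(x - a) - \psi(x)}{a},\ X\psi(x) \right\rangle + i\hbar \left\langle \psi(x - a), \psi(x) \right\rangle \right) \\
  &= \langle P\psi, X\psi \rangle + i\hbar \langle \psi, \psi \rangle.
\end{aligned}
\tag{12.19}
\]

In the second equality, we have used that $X$ is symmetric and that (check) if $\psi \in \mathrm{Dom}(X)$, then $\psi(x - a) \in \mathrm{Dom}(X)$ for each fixed $a$. In the last equality, we get a minus sign from having $\psi(x - a) - \psi(x)$ rather than $\psi(x + a) - \psi(x)$, and we use that translation is strongly continuous.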

It should be noted that (12.19) is precisely what we would get by formally moving $X$ to the right-hand side of the inner product, using the commutation relation $XP - PX = i\hbar I$, and then moving $P$ to the left-hand side of the inner product. But to make that calculation rigorous, we would need to assume that $\psi$ is in the domain of $XP$ and the domain of $PX$. In (12.19), on the other hand, we have obtained the desired conclusion assuming only that $\psi$ is in the domain of $X$ and in the domain of $P$.

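Having obtained (12.19), we can easily verify that for any real constants $\alpha$ and $\beta$, we have

\[ \langle (X - \alpha I)\psi, (P - \beta I)\psi \rangle = \langle (P - \beta I)\psi, (X - \alpha I)\psi \rangle + i\hbar \langle \psi, \psi \rangle. \tag{12.20} \]

Solving (12.20) for $\langle \psi, \psi \rangle$ gives

\[
\begin{aligned}
\langle \psi, \psi \rangle
  &= \frac{1}{i\hbar} \Big( \langle (X - \alpha I)\psi, (P - \beta I)\psi \rangle - \langle (P - \beta I)\psi, (X - \alpha I)\psi \rangle \Big) \\
  &= \frac{2}{\hbar}\, \mathrm{Im}\, \langle (X - \alpha I)\psi, (P - \beta I)\psi \rangle \\
  &\leq \frac{2}{\hbar}\, \|(X - \alpha I)\psi\|\, \|(P - \beta I)\psi\|,
\end{aligned}
\tag{12.21}
\]

by the Cauchy-Schwarz inequality. If $\psi$ is a unit vector and we take $\alpha = \langle X \rangle_\psi$ and $\beta = \langle P \rangle_\psi$, then $\|(X - \alpha I)\psi\|^2 = (\Delta_\psi X)^2$ and $\|(P - \beta I)\psi\|^2 = (\Delta_\psi P)^2$. Thus, we get

\[ 1 \leq \frac{2}{\hbar} (\Delta_\psi X)(\Delta_\psi P), \]

which is equivalent to what we want to prove. ∎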
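As a sanity check on (12.18), the following sketch evaluates both uncertainties for the non-Gaussian function $\psi(x) = e^{-|x|}$, which belongs to $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ even though it has a kink at the origin. The choice of test function, the units with $\hbar = 1$, and the integration window are all assumptions made for illustration. Because $\psi$ is real, $\langle P \rangle = 0$ and $\langle P\psi, P\psi \rangle = \hbar^2 \int |\psi'|^2\, dx$ (after normalization).

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption of this sketch)


def integrate(f, a, b, steps):
    """Midpoint rule on [a, b]; with these parameters no node hits x = 0."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h


def psi(x):
    return math.exp(-abs(x))


def dpsi(x):
    """Derivative of psi away from the kink at x = 0."""
    return -math.copysign(1.0, x) * math.exp(-abs(x))


L, N = 30.0, 60000
norm = integrate(lambda x: psi(x) ** 2, -L, L, N)
mean_x = integrate(lambda x: x * psi(x) ** 2, -L, L, N) / norm
mean_x2 = integrate(lambda x: x * x * psi(x) ** 2, -L, L, N) / norm
mean_p2 = HBAR ** 2 * integrate(lambda x: dpsi(x) ** 2, -L, L, N) / norm

delta_x = math.sqrt(mean_x2 - mean_x ** 2)
delta_p = math.sqrt(mean_p2)          # <P> = 0 for a real wave function
product = delta_x * delta_p

print(delta_x, delta_p, product)      # product is about 0.707 > 0.5
```

The exact values are $\Delta X = 1/\sqrt{2}$ and $\Delta P = \hbar$, so the product is $\hbar/\sqrt{2}$, strictly above the bound $\hbar/2$.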

We know from Sect. 12.2 that the strong form of the uncertainty principle does not hold if $X$ and $P$ are replaced by two arbitrary operators satisfying $AB - BA = i\hbar I$ on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$, even if $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$ is dense in $\mathbf{H}$. Nevertheless, if we look carefully at the proof of Theorem 12.7, we can see what assumptions we would need on $A$ and $B$ to make the proof go through in a more general setting.


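Theorem 12.8 Suppose $A$ and $B$ are self-adjoint operators on $\mathbf{H}$. Suppose that for all $a \in \mathbb{R}$ and $\psi \in \mathrm{Dom}(A)$, we have that $e^{iaB/\hbar}\psi$ belongs to $\mathrm{Dom}(A)$ and that

\[ A e^{iaB/\hbar}\psi = e^{iaB/\hbar} A\psi - a\, e^{iaB/\hbar}\psi. \tag{12.22} \]

Then for all unit vectors $\psi$ in $\mathrm{Dom}(A) \cap \mathrm{Dom}(B)$, we have

\[ (\Delta_\psi A)(\Delta_\psi B) \geq \frac{\hbar}{2}, \]

where $\Delta_\psi A$ and $\Delta_\psi B$ are defined by Definition 12.2.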


The relation

\[ e^{iaB/\hbar} A = A e^{iaB/\hbar} + a\, e^{iaB/\hbar}, \quad a \in \mathbb{R}, \tag{12.23} \]

which holds on $\mathrm{Dom}(A)$, is a “semi-exponentiated” form of the canonical commutation relations. As shown in Exercise 6, there is a formal argument (ignoring domain issues) that the commutation relations $[A, B] = i\hbar I$ ought to imply the relations (12.22). Nevertheless, as Exercise 7 shows, this formal argument does not always give the correct conclusion. In Sect. 14.2, we will encounter a “fully exponentiated” form of the canonical commutation relations, in which both $A$ and $B$ are exponentiated.

Proof. See Exercise 5. ∎


Corollary 12.9 For any $j = 1, \ldots, n$ and any unit vector $\psi \in L^2(\mathbb{R}^n)$ with $\psi \in \mathrm{Dom}(X_j) \cap \mathrm{Dom}(P_j)$, we have

\[ (\Delta_\psi X_j)(\Delta_\psi P_j) \geq \frac{\hbar}{2}. \]

Proof. In the case that $A = X_j$ and $B = P_j$, we have $(e^{iaP_j/\hbar}\psi)(\mathbf{x}) = \psi(\mathbf{x} + a\mathbf{e}_j)$, by Exercise 2 in Chap. 10. Thus, in this case, (12.22) says that

\[ (x_j + a)\psi(\mathbf{x} + a\mathbf{e}_j) = x_j \psi(\mathbf{x} + a\mathbf{e}_j) + a\psi(\mathbf{x} + a\mathbf{e}_j), \]

which is true. ∎


12.4 Minimum Uncertainty States


In this section, we look at the states that give equality in the uncertainty principle. Such states are known as minimum uncertainty states or coherent states. As in the general setting of Proposition 12.6, the condition for equality is an eigenvector condition. That is to say, even though in Theorem 12.7 we allow $\psi$’s that are not in $\mathrm{Dom}(XP) \cap \mathrm{Dom}(PX)$, we do not get any new minimum uncertainty states by this weakening of our domain assumptions.


Proposition 12.10 A unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ satisfies

\[ (\Delta_\psi X)(\Delta_\psi P) = \frac{\hbar}{2} \]

if and only if $\psi$ satisfies

\[ (X + i\delta P)\psi = \lambda\psi \tag{12.24} \]

for some nonzero real number $\delta$ and some complex number $\lambda$.


For convenience, we have made the substitution $\delta = -\gamma$ in (12.24) relative to Proposition 12.6.


FIGURE 12.1. Minimum uncertainty state with $\langle X \rangle = 1$, $\langle P \rangle = 0$, and $\Delta X = 1/2$. [Plot of $\mathrm{Re}[\psi(x)]$ against $x$.]


Proof. All the relations in the proof of Theorem 12.7 are equalities, except for the inequality in the last line of (12.21). Equality will hold in that line if and only if one of $(X - \alpha I)\psi$ and $(P - \beta I)\psi$ is zero or $(P - \beta I)\psi$ is a pure-imaginary multiple of $(X - \alpha I)\psi$. Now, if $\psi$ is a unit vector in $L^2(\mathbb{R})$, then neither $\psi$ nor the Fourier transform of $\psi$ can be supported at a single point; thus, neither $(X - \alpha I)\psi$ nor $(P - \beta I)\psi$ can be zero. We are left, then, with the condition that

\[ (X - \alpha I)\psi = i\gamma (P - \beta I)\psi, \tag{12.25} \]

where $\gamma$ is a nonzero real number, $\alpha = \langle X \rangle_\psi$ and $\beta = \langle P \rangle_\psi$. As in the proof of Proposition 12.6, (12.25) is equivalent to the assertion that $\psi$ is an eigenvector for the operator $X - i\gamma P$. Letting $\delta = -\gamma$ gives the desired result. ∎

FIGURE 12.2. Minimum uncertainty state with $\langle X \rangle = 1$, $\langle P \rangle = 10$, and $\Delta X = 1/2$.


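Proposition 12.11 If the parameter $\delta$ in (12.24) is negative, there are no nonzero solutions to (12.24). If the parameter $\delta$ is positive, there exists a unique (up to multiplication by a constant) solution $\psi_{\delta,\lambda}$ to (12.24) for every complex number $\lambda$. The function $\psi_{\delta,\lambda}$ has the following additional properties:

\[
\langle X \rangle = \mathrm{Re}\, \lambda, \qquad
\langle P \rangle = \frac{1}{\delta}\, \mathrm{Im}\, \lambda, \qquad
\frac{\Delta X}{\Delta P} = \delta.
\]

Explicitly, we have

\[
\psi_{\delta,\lambda}(x) = c_1 \exp\left\{ -\frac{(x - \lambda)^2}{2\delta\hbar} \right\}
  = c_2 \exp\left\{ -\frac{(x - \langle X \rangle)^2}{2\delta\hbar} \right\} \exp\left\{ \frac{i \langle P \rangle x}{\hbar} \right\},
\]

where all expectation values are taken in the state $\psi_{\delta,\lambda}$.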


Note that among states with $(\Delta X)(\Delta P) = \hbar/2$, we can arrange for $\Delta X/\Delta P$ to be any positive real number, and once we have chosen $\Delta X/\Delta P$, we can then arrange for $\langle X \rangle$ and $\langle P \rangle$ to be any two real numbers. On the other hand, once $\Delta X/\Delta P$ and $\langle X \rangle$ and $\langle P \rangle$ have been specified, there is a unique quantum state with $(\Delta X)(\Delta P) = \hbar/2$. In Figs. 12.1-12.3, we have plotted the real part of $\psi_{\delta,\lambda}$ for several different values of the parameters, in a system of units for which $\hbar = 1$.

FIGURE 12.3. Minimum uncertainty state with $\langle X \rangle = 1$, $\langle P \rangle = 20$, and $\Delta X = 1$.

Proof. The equation $(X + i\delta P)\psi = \lambda\psi$ amounts to

\[ x\psi(x) + \delta\hbar \frac{d\psi}{dx} = \lambda\psi(x), \tag{12.26} \]

where $\psi$ is assumed to be in the domain of $P$, so that the distributional derivative of $\psi$ is an $L^2$ function. If $\psi$ were smooth, then the unique solution to (12.26) would be the function $\psi_{\delta,\lambda}$ given in the proposition, which is square-integrable if and only if $\delta > 0$. Even if (12.26) is only assumed to hold in the distribution sense, the argument in the proof of Proposition 9.29 (with $e^{-x/\hbar}\psi(x)$ replaced by $\exp[(x - \lambda)^2/(2\delta\hbar)]\psi(x)$) shows that there are no additional solutions. The formulas for $\langle X \rangle$, $\langle P \rangle$, and $\Delta X/\Delta P$ can be computed either by tracing through the arguments in the proof of Theorem 12.7 or by direct calculation with the formula for $\psi_{\delta,\lambda}$. ∎
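The “direct calculation” mentioned at the end of the proof can also be carried out numerically. The sketch below, in units with $\hbar = 1$, integrates the Gaussian $\psi(x) = \exp[-(x - \lambda)^2/(2\delta\hbar)]$ with a midpoint rule and uses $\psi'(x) = (\lambda - x)\psi(x)/(\delta\hbar)$, which is (12.26) rearranged. The function name `moments`, the integration window, and the sample values $\delta = 1/2$, $\lambda = 1 + 2i$ are illustrative assumptions.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1, as in the figures


def moments(delta, lam, half_width=12.0, steps=40000):
    """Return (<X>, <P>, Delta X, Delta P) for the unnormalized state
    psi(x) = exp(-(x - lam)^2 / (2 delta hbar)), by a midpoint rule
    on an interval centred at Re(lam)."""
    lam = complex(lam)
    h = 2 * half_width / steps
    xs = [lam.real - half_width + (k + 0.5) * h for k in range(steps)]
    psi = [cmath.exp(-(x - lam) ** 2 / (2 * delta * HBAR)) for x in xs]
    # From (12.26): psi'(x) = (lam - x) psi(x) / (delta hbar).
    dpsi = [(lam - x) / (delta * HBAR) * p for x, p in zip(xs, psi)]

    norm = sum(abs(p) ** 2 for p in psi) * h
    mean_x = sum(x * abs(p) ** 2 for x, p in zip(xs, psi)) * h / norm
    mean_x2 = sum(x * x * abs(p) ** 2 for x, p in zip(xs, psi)) * h / norm
    # <P> = <psi, -i hbar psi'> and <P psi, P psi> = hbar^2 * integral |psi'|^2.
    mean_p = (sum(p.conjugate() * (-1j * HBAR) * d
                  for p, d in zip(psi, dpsi)) * h / norm).real
    mean_p2 = HBAR ** 2 * sum(abs(d) ** 2 for d in dpsi) * h / norm

    dx = math.sqrt(mean_x2 - mean_x ** 2)
    dp = math.sqrt(mean_p2 - mean_p ** 2)
    return mean_x, mean_p, dx, dp


mean_x, mean_p, dx, dp = moments(delta=0.5, lam=1 + 2j)
print(mean_x, mean_p)     # about 1 and 4, i.e. Re(lam) and Im(lam)/delta
print(dx * dp, dx / dp)   # about 0.5 (= hbar/2) and 0.5 (= delta)
```

With these sample values the computed quantities agree with $\langle X \rangle = \mathrm{Re}\,\lambda$, $\langle P \rangle = \mathrm{Im}\,\lambda/\delta$, $\Delta X/\Delta P = \delta$, and $(\Delta X)(\Delta P) = \hbar/2$ to high accuracy.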


12.5 Exercises 
1. Let $a$ be a positive real number. Show that the following “additive” version of the uncertainty principle holds for all unit vectors $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$:

\[ a \Delta_\psi X + \frac{1}{a} \Delta_\psi P \geq \sqrt{2\hbar}. \]


2. In this exercise, we classify the simultaneous eigenvectors of the noncommuting operators $J_1$ and $J_2$. Let $J_1$, $J_2$, and $J_3$ denote the angular momentum operators on $L^2(\mathbb{R}^3)$ as defined in Sect. 3.10. Suppose $\psi$ is in the domain of any product $J_j J_k$ of two angular momentum operators. (For example, $\psi$ could be a Schwartz function.) Suppose also that $\psi$ is an eigenvector for $J_1$ and for $J_2$ with eigenvalues $\alpha$ and $\beta$, respectively.

(a) Using the commutation relations in Exercise 10 in Chap. 3, show that $\psi$ is an eigenvector for $J_3$ with eigenvalue 0.

(b) Show that the eigenvalues $\alpha$ and $\beta$ for $J_1$ and $J_2$ must be zero.

(c) What type of function $\psi \in L^2(\mathbb{R}^3)$ satisfies $J_j \psi = 0$ for $j = 1, 2, 3$?


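3. Given any unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$, consider another vector $\phi$ given by

\[ \phi(x) = e^{ibx/\hbar}\, \psi(x - a). \]

Show that $\phi$ is a unit vector belonging to $\mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ and that

\[ \langle X \rangle_\phi = \langle X \rangle_\psi + a, \qquad \Delta_\phi X = \Delta_\psi X \]

and

\[ \langle P \rangle_\phi = \langle P \rangle_\psi + b, \qquad \Delta_\phi P = \Delta_\psi P. \]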


4. We have seen that a unit vector $\psi \in \mathrm{Dom}(X) \cap \mathrm{Dom}(P)$ is a minimum uncertainty state [i.e., $(\Delta_\psi X)(\Delta_\psi P) = \hbar/2$] if and only if there exists some $\delta > 0$ such that $\psi$ is an eigenvector of the operator $X + i\delta P$. In that case, $\psi$ is also an eigenvector for any operator of the form $c(X + i\delta P)$, with $c$ being a nonzero constant. Consider, then, some fixed $\delta > 0$ and define an operator $a$ by the formula

\[ a = \frac{\frac{1}{\delta}(X + i\delta P)}{\sqrt{2\hbar/\delta}}. \]

Then $a$ is just the annihilation operator, as defined in Chap. 11, for a harmonic oscillator with $m\omega = 1/\delta$. Thus, $a$ and its adjoint $a^*$ satisfy the relation $[a, a^*] = I$, and we have the “chain” of eigenvectors $\psi_n \in L^2(\mathbb{R})$ satisfying the properties listed in Theorem 11.2.

(a) For any $\lambda \in \mathbb{C}$, find constants $c_n$ so that the vector

\[ \phi_\lambda = \sum_{n=0}^{\infty} c_n \psi_n \]

is an eigenvector for $a$ with eigenvalue $\lambda$. Show that the resulting series converges in $\mathbf{H}$.

(b) Let $\phi_\lambda$ denote the eigenvector obtained in Part (a), normalized so that $c_0 = 1$. Show that

\[ \phi_\lambda = e^{\lambda a^*} \phi_0, \]

where the exponential is defined by

\[ e^{\lambda a^*} \phi_0 = \sum_{n=0}^{\infty} \frac{\lambda^n}{n!} (a^*)^n \phi_0, \]

with convergence in $L^2(\mathbb{R})$.


5. Prove Theorem 12.8, following the outline of the proof of Theorem 12.7. Recall from Sect. 10.2 that $B/\hbar$ is the infinitesimal generator of the one-parameter unitary group $U(a) := e^{iaB/\hbar}$.


6. If $X$ and $Y$ are bounded operators, we may define $\mathrm{ad}_X(Y) = [X, Y]$, where $[X, Y] = XY - YX$. Thus, say, $(\mathrm{ad}_X)^3(Y) = [X, [X, [X, Y]]]$. It is not hard to show that for any bounded operators $Y$ and $X$, we have

\[ e^{X} Y e^{-X} = e^{\mathrm{ad}_X}(Y). \tag{12.27} \]

(See Proposition 2.25 and Exercise 2.19 of [21].)


Suppose $A$ and $B$ are unbounded self-adjoint operators satisfying $[A, B] = i\hbar I$ on $\mathrm{Dom}(AB) \cap \mathrm{Dom}(BA)$. Show that if we could apply (12.27) with $X = iaB/\hbar$ and $Y = A$ (even though $X$ and $Y$ are unbounded), then $A$ and $B$ would satisfy (12.22).


7. Let $A$ be the operator in Sect. 12.2, and let $B$ be the unique self-adjoint extension of the operator $B$ in that section. Show that the operators $X = iaB/\hbar$ and $Y = A$ do not satisfy (12.27).


Note: This result shows the hazards involved in formally applying results for bounded operators to unbounded operators.


Hint: Show that the unitary operators $U(a) := \exp(iaB/\hbar)$ consist of “translation with wrap around,” first on the eigenvectors of $B$ and then on the whole Hilbert space.


13 


Quantization Schemes for Euclidean Space


13.1 Ordering Ambiguities 


One of the axioms of quantum mechanics states, “To each real-valued function $f$ on the classical phase space there is associated a self-adjoint operator $\hat{f}$ on the quantum Hilbert space.” The attentive reader will note that we have not, up to this point, given a general procedure for constructing $\hat{f}$ from $f$. If we call $\hat{f}$ the quantization of $f$, then we have only discussed the quantizations of a few very special classical observables, such as position, momentum, and energy.

Let us now think about what would go into quantizing a (more-or-less) general observable. Let us consider for simplicity a particle moving in $\mathbb{R}^1$ and let us assume that quantizations of $x$ and $p$ are the usual position and momentum operators $X$ and $P$. What should the quantization of, say, $xp$ be? Classically, $xp$ and $px$ are the same, but quantum mechanically, $XP$ does not equal $PX$. Furthermore, neither $XP$ nor $PX$ is self-adjoint, because $(XP)^* = P^* X^* = PX$, and $PX \neq XP$. In this case, then, a reasonable candidate for the quantization would be

\[ \widehat{xp} = \frac{1}{2}(XP + PX). \]

The significance of this simple example is that the failure of commutativity among quantum operators creates an ambiguity in the quantization process. It does not make sense to simply “replace $x$ by $X$ and $p$ by $P$ everywhere in the formula,” since the ordering of position and momentum makes no difference on the classical side, but it does on the quantum side. Up to this point, we have not really had to confront this ambiguity, because of the special form of the observables we have quantized. The Hamiltonian, for example, is typically of the form $H(x, p) = p^2/(2m) + V(x)$. Since each term contains only $x$ or only $p$, it is natural to quantize $H$ to $\hat{H} = P^2/(2m) + V(X)$, where $V(X)$ may be defined by the functional calculus or simply as multiplication by $V(x)$. In defining the angular momentum operators, we do encounter products of position and momentum, but never of the same component of position and momentum. For a particle in $\mathbb{R}^2$, for example, we have $J = x_1 p_2 - x_2 p_1$. On the quantum side, $X_1$ commutes with $P_2$ and $X_2$ with $P_1$, and thus there is no ambiguity: $X_1 P_2 - X_2 P_1$ is the same as $P_2 X_1 - P_1 X_2$.
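The ordering ambiguity for $xp$ can be made concrete with finite matrices. The sketch below builds $N \times N$ truncations of $X$ and $P$ in the harmonic oscillator number basis, using $X = \sqrt{\hbar/2}\,(a + a^*)$ and $P = -i\sqrt{\hbar/2}\,(a - a^*)$ with $m\omega = 1$ and $\hbar = 1$. The truncation size, the units, and the helper names are assumptions made for illustration. A caveat: finite matrices cannot satisfy $[X, P] = i\hbar I$ exactly, since a commutator has trace zero, so the relation holds only away from the last diagonal entry.

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption of this sketch)
N = 6       # truncation size (assumption of this sketch)


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_hermitian(M, tol=1e-12):
    """True if M equals its conjugate transpose up to tol."""
    n = len(M)
    return all(abs(M[i][j] - complex(M[j][i]).conjugate()) < tol
               for i in range(n) for j in range(n))


# Truncated lowering operator: a e_j = sqrt(j) e_{j-1}. It is real, so its
# adjoint is just the transpose, low[j][i].
low = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(N)]
       for i in range(N)]
c = math.sqrt(HBAR / 2)
X = [[c * (low[i][j] + low[j][i]) for j in range(N)] for i in range(N)]
P = [[-1j * c * (low[i][j] - low[j][i]) for j in range(N)] for i in range(N)]

XP = matmul(X, P)
PX = matmul(P, X)
sym = [[(XP[i][j] + PX[i][j]) / 2 for j in range(N)] for i in range(N)]
comm = [[XP[i][j] - PX[i][j] for j in range(N)] for i in range(N)]

print(is_hermitian(XP))   # False
print(is_hermitian(sym))  # True
print(comm[0][0])         # approximately 1j, i.e. i * hbar
```

The product $XP$ fails to be Hermitian, while the symmetrized combination $\frac{1}{2}(XP + PX)$ is Hermitian, matching the candidate quantization of $xp$ discussed above.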

When we turn to the quantization of a general observable, however, 
we must confront the ordering ambiguity directly. Groenewold’s theorem 
(Sect. 13.4) suggests that there is no single “perfect” quantization scheme. 
Nevertheless, there is one that is generally acknowledged as having the best 
properties, the Weyl quantization, and we spend most of our time with 
that particular scheme. Other quantization schemes do also play a role in 
physics, however; Wick-ordered quantization, notably, plays an important 
role in quantum field theory. (In quantum field theory, the replacement of 
certain Weyl-quantized operators with their Wick-quantized counterparts 
is interpreted as a type of renormalization.) 


13.2 Some Common Quantization Schemes 


In this section, we consider several of the most commonly used quantization schemes. For simplicity, we limit our attention to systems with one degree of freedom and to classical observables that are polynomials in $x$ and $p$. (We consider the Weyl quantization in greater generality in Sect. 13.3.) Furthermore, we resolve in this section not to worry about domain questions and simply to use $C_c^\infty(\mathbb{R})$ as the domain for all of our operators. Thus, in this section, equality of operators means equality as maps of $C_c^\infty(\mathbb{R})$ to itself. It should be noted that the operators of the sort we will be considering may very well fail to be essentially self-adjoint, even if they are symmetric. Section 9.10 shows, for example, that the operator $P^2 - cX^4$, for $c > 0$, is not essentially self-adjoint on $C_c^\infty(\mathbb{R})$. We follow the terminology of harmonic analysis by referring to a classical symbol $f$ as the symbol of its quantization $\hat{f}$. Once we have discussed each quantization scheme briefly, we will formalize the definitions of all the schemes in Definition 13.1.

The simplest approach to quantization is to choose, once and for all, which to put first, the position or the momentum operators. We may, for example, choose to put the momentum operators to the right, acting first, and the position operators to the left, acting second. In this approach, a polynomial in $x$ and $p$ will quantize to a differential operator in “standard form,” with all the derivatives acting first, followed by multiplication operators. In harmonic analysis, there is a method for extending this quantization scheme to more-or-less arbitrary symbols $f$. For a general (nonpolynomial) symbol $f$, the resulting operator $\hat{f}$ is known as a pseudodifferential operator.

A serious drawback of the pseudodifferential quantization is that even when the symbol $f$ is real-valued, the operator $\hat{f}$ it produces is typically not self-adjoint (or even symmetric). If, for example, $f(x, p) = xp$, then the associated operator is $XP$, the adjoint of which is $PX$, which is not equal to $XP$. The simplest way to fix this problem is to symmetrize the operator by taking half the sum of the operator and its adjoint.

The Weyl quantization, meanwhile, takes more seriously the possibility 
of different orderings of X and P, by considering all possible orderings. 
Thus, in quantizing, say, $x^2p^2$, the Weyl quantization will give

\[
\frac{1}{6}\left(X^2P^2 + XPXP + XP^2X + PX^2P + PXPX + P^2X^2\right).
\]


For a general monomial, the Weyl quantization similarly averages all the 
possible orderings of the position and momentum operators. 

For Wick-ordered and anti-Wick-ordered quantization, we no longer 
regard the position and momentum operators as the “basic” operators, 
but rather the creation and annihilation operators. Specifically, given any
positive real number $\alpha$, we introduce complex coordinates on the classical
phase space by

\[
\begin{aligned}
z &= x - i\alpha p \\
\bar{z} &= x + i\alpha p.
\end{aligned} \tag{13.1}
\]

(Although it would seem more natural to define $z$ to be $x + i\alpha p$, this
choice would lead to problems later, especially with the Segal-Bargmann
transform.) We then consider the corresponding quantum operators, which
we call the raising and lowering operators:

\[
\begin{aligned}
a^* &= X - i\alpha P \\
a &= X + i\alpha P.
\end{aligned} \tag{13.2}
\]

In comparing these operators to the ones defined in the context of the
harmonic oscillator, we should think of $\alpha$ as corresponding to $1/(m\omega)$.
Even with this identification, however, the operators in (13.2) differ by a
constant from the raising and lowering operators of Chap. 11. [The overall
normalization of the raising and lowering operators is not important
in this context, provided that we are consistent in the normalization
between (13.1) and (13.2).] In particular, the commutator of $a$ and $a^*$ is not
$I$ but rather $2\alpha\hbar I$.

In Wick-ordered quantization, we begin by expressing the classical
observable $f$ in terms of $z$ and $\bar{z}$ rather than in terms of x and p. When we
quantize, we put all the lowering operators (coming from the factors of $\bar{z}$
in $f$) to the right, acting first, and the raising operators (coming from the
factors of $z$ in $f$) to the left, acting second. This approach to quantization is
useful in quantum field theory, where letting the lowering operators act first 
can cause certain otherwise ill-defined expressions to become well defined. 
In anti-Wick-ordered quantization, we do the reverse, putting the raising 
operators to the right, acting first. Although anti-Wick-ordered quantiza- 
tion seems singular in the context of quantum field theory, in systems with 
finitely many degrees of freedom, it is actually better behaved than Wick- 
ordered quantization. 


Definition 13.1 Define several different quantization schemes for symbols
that are polynomials in x and p as follows. Each scheme is uniquely
determined, as a linear map from polynomials on $\mathbb{R}^2$ into operators on
$C_c^\infty(\mathbb{R})$, by the indicated formulas.

1. Pseudodifferential operator quantization:
\[
Q(x^j p^k) = X^j P^k.
\]
2. Symmetrized pseudodifferential operator quantization:
\[
Q(x^j p^k) = \frac{1}{2}\left(X^j P^k + P^k X^j\right).
\]
3. Weyl quantization:
\[
Q(x^j p^k) = \frac{1}{(j+k)!} \sum_{\sigma \in S_{j+k}}
\sigma(\underbrace{X, \ldots, X}_{j}, \underbrace{P, \ldots, P}_{k}),
\]
where for any operators $A_1, A_2, \ldots, A_n$ and any $\sigma \in S_n$, we define
\[
\sigma(A_1, A_2, \ldots, A_n) = A_{\sigma(1)} A_{\sigma(2)} \cdots A_{\sigma(n)}. \tag{13.3}
\]
4. Wick-ordered quantization with parameter $\alpha$:
\[
Q\left((x + i\alpha p)^j (x - i\alpha p)^k\right) = (X - i\alpha P)^k (X + i\alpha P)^j, \quad \alpha > 0.
\]
5. Anti-Wick-ordered quantization with parameter $\alpha$:
\[
Q\left((x + i\alpha p)^j (x - i\alpha p)^k\right) = (X + i\alpha P)^j (X - i\alpha P)^k, \quad \alpha > 0.
\]


In applications, the most useful quantization schemes are the Wick-ordered,
anti-Wick-ordered, and Weyl schemes. All of the quantization
schemes in Definition 13.1 except the pseudodifferential operator quantization
have the property of mapping real-valued polynomials to symmetric
operators on $C_c^\infty(\mathbb{R})$. (See Exercise 3 in the case of the Wick- and
anti-Wick-ordered quantizations.)

In comparing the different quantization schemes, it is important to rec- 
ognize that two different expressions may describe the same operator. We 
may calculate, for example, that 


\[
\frac{1}{2}\left(XP^2 + P^2X\right)
= \frac{1}{2}\left(PXP + [X, P]P + PXP - P[X, P]\right)
= PXP,
\]

since $[X, P]$ is a multiple of the identity and thus commutes with $P$. As a
result, we can eliminate the $PXP$ term in the Weyl quantization of $xp^2$,
with the result that

\[
Q_{\mathrm{Weyl}}(xp^2) = \frac{1}{3}\left(XP^2 + PXP + P^2X\right)
= \frac{1}{2}\left(XP^2 + P^2X\right), \tag{13.4}
\]

which coincides, in this very special case, with the symmetrized
pseudodifferential quantization of $xp^2$.
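
The averaging over orderings in Definition 13.1 and the identity (13.4) can be checked mechanically. The following is only a minimal sketch under stated assumptions: operators are stored in "X to the left of P" normal form, the commutator $[X, P]$ is treated as a formal central symbol `c` (standing for $i\hbar I$), and the helper names `add`, `mul`, `word`, and `weyl` are hypothetical, chosen for this illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

# An operator is a dict {(j, k, m): coeff} standing for the sum of
# coeff * c^m * X^j * P^k, where c = [X, P] is a formal central symbol.

def add(A, B, scale=1):
    """Return A + scale * B, dropping zero coefficients."""
    out = dict(A)
    for key, val in B.items():
        out[key] = out.get(key, 0) + scale * val
    return {key: val for key, val in out.items() if val != 0}

def mul(A, B):
    """Product A * B, rewritten so that every X stands to the left of every P."""
    out = {}
    for (j1, k1, m1), c1 in A.items():
        for (j2, k2, m2), c2 in B.items():
            # P^k1 X^j2 = sum_l C(k1,l) C(j2,l) l! (-c)^l X^(j2-l) P^(k1-l)
            for l in range(min(k1, j2) + 1):
                coeff = c1 * c2 * comb(k1, l) * comb(j2, l) * factorial(l) * (-1) ** l
                key = (j1 + j2 - l, k1 + k2 - l, m1 + m2 + l)
                out[key] = out.get(key, 0) + coeff
    return {key: val for key, val in out.items() if val != 0}

X = {(1, 0, 0): Fraction(1)}
P = {(0, 1, 0): Fraction(1)}
I = {(0, 0, 0): Fraction(1)}

def word(letters):
    """Multiply out a word such as ('P', 'X', 'P') from left to right."""
    out = I
    for ch in letters:
        out = mul(out, X if ch == "X" else P)
    return out

def weyl(j, k):
    """Average over all orderings of j copies of X and k copies of P.

    Averaging over distinct words equals averaging over all (j+k)!
    permutations, since each distinct word occurs j! k! times.
    """
    words = set(permutations("X" * j + "P" * k))
    total = {}
    for w in words:
        total = add(total, word(w))
    return {key: val / len(words) for key, val in total.items()}

# Symmetrized pseudodifferential quantization of x p^2: (X P^2 + P^2 X) / 2.
sym = add({}, add(mul(X, mul(P, P)), mul(mul(P, P), X)), scale=Fraction(1, 2))
print(weyl(1, 2) == sym)
print(weyl(1, 2) == mul(P, mul(X, P)))
```

Both comparisons are between normal forms, so two different-looking expressions are reported equal exactly when they describe the same operator, which is the point made in the text.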


Example 13.2 If $f(x,p) = x^2$, then the Weyl, Wick-ordered and
anti-Wick-ordered quantizations of $f$ are as follows:

\[
\begin{aligned}
Q_{\mathrm{Weyl}}(x^2) &= X^2 \\
Q_{\mathrm{Wick}}(x^2) &= X^2 - \tfrac{1}{2}\alpha\hbar I \\
Q_{\mathrm{anti\text{-}Wick}}(x^2) &= X^2 + \tfrac{1}{2}\alpha\hbar I.
\end{aligned}
\]

Proof. The value for $Q_{\mathrm{Weyl}}(x^2)$ is apparent. To compute the Wick- and
anti-Wick-ordered quantizations, we first write $x$ as $(z + \bar{z})/2$, so that

\[
x^2 = \frac{(z + \bar{z})^2}{4} = \frac{1}{4}\left(z^2 + 2z\bar{z} + \bar{z}^2\right).
\]

Thus, we have, for example,

\[
Q_{\mathrm{Wick}}(x^2) = \frac{1}{4}\left((X - i\alpha P)^2 + 2(X - i\alpha P)(X + i\alpha P) + (X + i\alpha P)^2\right).
\]

When we expand this expression out, the $P^2$ terms cancel, and the $XP$
and $PX$ terms from $(X - i\alpha P)^2$ will cancel with the $XP$ and $PX$ terms
from $(X + i\alpha P)^2$. Thus, we will be left with $X^2$ terms and the $XP$ and
$PX$ terms from the cross-term above:

\[
Q_{\mathrm{Wick}}(x^2) = \frac{1}{4}\left(4X^2 + 2i\alpha[X, P]\right).
\]

Using the commutation relation between $X$ and $P$ gives the desired result.
The calculation of $Q_{\mathrm{anti\text{-}Wick}}(x^2)$ is identical except that the order of the
factors in the cross-term is reversed, which gives the opposite sign for the
$[X,P]$ term. ∎
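
The computation in Example 13.2 can be reproduced numerically. This is a sketch only, assuming the sample values $\hbar = 1$ and $\alpha = 0.5$ (neither is fixed by the text) and a hypothetical dictionary representation in which `{(j, k): coeff}` stands for the sum of `coeff * X^j P^k`.

```python
from math import comb, factorial

HBAR = 1.0    # assumed numerical value of hbar
ALPHA = 0.5   # assumed value of the parameter alpha > 0

def mul(A, B):
    """Product A * B in the normal form with X to the left of P."""
    out = {}
    for (j1, k1), c1 in A.items():
        for (j2, k2), c2 in B.items():
            # P^k1 X^j2 = sum_l C(k1,l) C(j2,l) l! (-i hbar)^l X^(j2-l) P^(k1-l)
            for l in range(min(k1, j2) + 1):
                coeff = (c1 * c2 * comb(k1, l) * comb(j2, l)
                         * factorial(l) * (-1j * HBAR) ** l)
                key = (j1 + j2 - l, k1 + k2 - l)
                out[key] = out.get(key, 0) + coeff
    return out

def combine(terms):
    """Linear combination sum(w * A), with negligible coefficients removed."""
    out = {}
    for w, A in terms:
        for key, val in A.items():
            out[key] = out.get(key, 0) + w * val
    return {key: val for key, val in out.items() if abs(val) > 1e-12}

raising = {(1, 0): 1, (0, 1): -1j * ALPHA}   # a* = X - i alpha P
lowering = {(1, 0): 1, (0, 1): 1j * ALPHA}   # a  = X + i alpha P

# x^2 = (z^2 + 2 z zbar + zbar^2) / 4, with z -> a* and zbar -> a.
wick = combine([(0.25, mul(raising, raising)),
                (0.5, mul(raising, lowering)),    # lowering operator acts first
                (0.25, mul(lowering, lowering))])
anti = combine([(0.25, mul(raising, raising)),
                (0.5, mul(lowering, raising)),    # raising operator acts first
                (0.25, mul(lowering, lowering))])
print(wick)
print(anti)
```

Only the coefficients of $X^2$ and of the identity survive, and the identity coefficients differ in sign between the two orderings, in line with the example.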


Proposition 13.3 The Weyl quantization, viewed as a linear map of the
space of polynomials on $\mathbb{R}^2$ into operators on $C_c^\infty(\mathbb{R})$, is uniquely
characterized by the following identity:

\[
Q_{\mathrm{Weyl}}\left((ax + bp)^j\right) = (aX + bP)^j \tag{13.5}
\]

for all non-negative integers $j$ and all $a, b \in \mathbb{C}$.


Proof. The Weyl quantization is easily seen to satisfy the identity

\[
Q_{\mathrm{Weyl}}\left((a_1 x + b_1 p) \cdots (a_j x + b_j p)\right)
= \frac{1}{j!} \sum_{\sigma \in S_j} \sigma(a_1 X + b_1 P, \ldots, a_j X + b_j P), \tag{13.6}
\]

for all sequences $a_1, \ldots, a_j$ and $b_1, \ldots, b_j$ of complex numbers, where the
expression $\sigma(\cdot, \cdot, \ldots, \cdot)$ is defined by (13.3). Specializing to the case where all
the $a_k$'s are equal to $a$ and all the $b_k$'s are equal to $b$ gives (13.5). Conversely,
suppose that $Q$ is any linear map of polynomials into operators on $C_c^\infty(\mathbb{R})$
satisfying $Q((ax + bp)^j) = (aX + bP)^j$ for all $a$, $b$, and $j$. For each $j$, let
$V_j$ denote the space of homogeneous polynomials $f$ of degree $j$ such that
$Q(f) = Q_{\mathrm{Weyl}}(f)$. Then $V_j$ contains all polynomials of the form $(ax + bp)^j$,
and thus, by Exercise 1, $V_j$ consists of all homogeneous polynomials of
degree $j$, so that $Q = Q_{\mathrm{Weyl}}$. ∎


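Proposition 13.4 The Weyl quantization satisfies

\begin{align}
Q_{\mathrm{Weyl}}(xg) &= Q_{\mathrm{Weyl}}(x)Q_{\mathrm{Weyl}}(g) - \frac{i\hbar}{2} Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial p}\right) \tag{13.7} \\
&= Q_{\mathrm{Weyl}}(g)Q_{\mathrm{Weyl}}(x) + \frac{i\hbar}{2} Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial p}\right) \tag{13.8}
\end{align}

and

\begin{align}
Q_{\mathrm{Weyl}}(pg) &= Q_{\mathrm{Weyl}}(p)Q_{\mathrm{Weyl}}(g) + \frac{i\hbar}{2} Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial x}\right) \tag{13.9} \\
&= Q_{\mathrm{Weyl}}(g)Q_{\mathrm{Weyl}}(p) - \frac{i\hbar}{2} Q_{\mathrm{Weyl}}\!\left(\frac{\partial g}{\partial x}\right) \tag{13.10}
\end{align}

for all polynomials $g$ in x and p.


It should be noted that the formulas for the Weyl quantization in
Proposition 13.4 may not give the same “expression” for $Q_{\mathrm{Weyl}}(f)$ as does
Definition 13.1, but they do give the same operator. [Compare (13.4).]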

Proof. Suppose $A = a_1 X + b_1 P$ and $B = a_2 X + b_2 P$. Then $[A, B]$ is a
multiple of $I$, from which we can easily verify that

\[
AB^j = B^k A B^{j-k} + k[A, B]B^{j-1},
\]

for $0 \le k \le j$. If we sum this relation over $k$ and divide by $j + 1$, we obtain

\[
AB^j = \frac{1}{j+1} \sum_{k=0}^{j} B^k A B^{j-k} + \frac{j}{2}[A, B]B^{j-1}. \tag{13.11}
\]

Now, $A$ is the Weyl quantization of $a_1 x + b_1 p$ and $B^j$ is the Weyl
quantization of $(a_2 x + b_2 p)^j$, and both terms on the right-hand side of (13.11) are
easily recognized as Weyl quantizations. Thus, after rearranging the terms
and evaluating the commutator, (13.11) becomes

\begin{align}
Q_{\mathrm{Weyl}}\left((a_1 x + b_1 p)(a_2 x + b_2 p)^j\right)
&= Q_{\mathrm{Weyl}}(a_1 x + b_1 p)\, Q_{\mathrm{Weyl}}\left((a_2 x + b_2 p)^j\right) \notag \\
&\quad - \frac{i\hbar j}{2}(a_1 b_2 - a_2 b_1)\, Q_{\mathrm{Weyl}}\left((a_2 x + b_2 p)^{j-1}\right). \tag{13.12}
\end{align}

Meanwhile, if we run the same argument starting with $B^j A$, we obtain a
similar result:

\begin{align}
Q_{\mathrm{Weyl}}\left((a_1 x + b_1 p)(a_2 x + b_2 p)^j\right)
&= Q_{\mathrm{Weyl}}\left((a_2 x + b_2 p)^j\right) Q_{\mathrm{Weyl}}(a_1 x + b_1 p) \notag \\
&\quad + \frac{i\hbar j}{2}(a_1 b_2 - a_2 b_1)\, Q_{\mathrm{Weyl}}\left((a_2 x + b_2 p)^{j-1}\right). \tag{13.13}
\end{align}

If we specialize to the case $(a_1, b_1) = (1, 0)$ and $(a_2, b_2) = (a, b)$, we get

\begin{align}
Q_{\mathrm{Weyl}}\left(x(ax + bp)^j\right) &= Q_{\mathrm{Weyl}}(x)\, Q_{\mathrm{Weyl}}\left((ax + bp)^j\right) \notag \\
&\quad - \frac{i\hbar j b}{2}\, Q_{\mathrm{Weyl}}\left((ax + bp)^{j-1}\right), \tag{13.14}
\end{align}

where the last term on the right-hand side of (13.14) is $-i\hbar/2$ times the
Weyl quantization of $\partial (ax + bp)^j / \partial p$. Thus, (13.14) is precisely (13.7) in the
case $g(x, p) = (ax + bp)^j$. We can then see from Exercise 1 that (13.7) holds
for all polynomials $g$. The proofs of (13.8), (13.9), and (13.10) are similar. ∎


13.3. The Weyl Quantization for $\mathbb{R}^{2n}$


In this section, we study the Weyl quantization on a much larger class of
symbols (i.e., classical observables) than the polynomial symbols considered
in the previous section. We also generalize from symbols defined on $\mathbb{R}^2$ to
symbols defined on $\mathbb{R}^{2n}$.

13.3.1 Heuristics


It is a straightforward matter to extend the Weyl quantization on
polynomials from $\mathbb{R}^2$ to $\mathbb{R}^{2n}$. This extended quantization will satisfy

\[
Q_{\mathrm{Weyl}}\left((a \cdot x + b \cdot p)^j\right) = (a \cdot X + b \cdot P)^j \tag{13.15}
\]

for all $a, b \in \mathbb{R}^n$ and all non-negative integers $j$, as in Proposition 13.3 in
the $n = 1$ case. Suppose we wish to extend $Q_{\mathrm{Weyl}}$ to certain nonpolynomial
symbols, starting with complex exponentials. If we multiply (13.15) by
$i^j/j!$ and sum on $j$, we would expect to have

\[
Q_{\mathrm{Weyl}}\left(e^{i(a \cdot x + b \cdot p)}\right) = e^{i(a \cdot X + b \cdot P)}. \tag{13.16}
\]

Now, if $f$ is any sufficiently nice function on $\mathbb{R}^{2n}$, we can expand $f$ as an
integral involving functions of the form $\exp(i(a \cdot x + b \cdot p))$, by using the
Fourier transform:

\[
f(x, p) = (2\pi)^{-n} \int_{\mathbb{R}^{2n}} \hat{f}(a, b)\, e^{i(a \cdot x + b \cdot p)} \, da \, db,
\]

where $\hat{f}$ is the Fourier transform of $f$. In light of (13.16), it is then natural
to define

\[
Q_{\mathrm{Weyl}}(f) = (2\pi)^{-n} \int_{\mathbb{R}^{2n}} \hat{f}(a, b)\, e^{i(a \cdot X + b \cdot P)} \, da \, db. \tag{13.17}
\]
Before proceeding, let us pause for a moment to compute the operator
$\exp(i(a \cdot X + b \cdot P))$. If $A$ and $B$ are bounded operators that commute with
their commutator (i.e., such that $[A, [A, B]] = [B, [A, B]] = 0$), then

\[
e^{A+B} = e^{-[A,B]/2} e^{A} e^{B}. \tag{13.18}
\]

(See Theorem 14.1, which is proved in Sect. 3.1 of [21]. Equation (13.18) is
a special case of the Baker-Campbell-Hausdorff formula.) If we formally
apply (13.18) with $A = i a \cdot X$ and $B = i b \cdot P$ (even though these are
unbounded operators), we obtain

\[
e^{i(a \cdot X + b \cdot P)} = e^{i\hbar(a \cdot b)/2}\, e^{i a \cdot X} e^{i b \cdot P}. \tag{13.19}
\]

Meanwhile, by Example 10.16 in Sect. 10.2, we know that

\[
(e^{i b \cdot P}\psi)(x) = \psi(x + \hbar b).
\]

Thus, we may reasonably hope that

\[
(e^{i(a \cdot X + b \cdot P)}\psi)(x) = e^{i\hbar(a \cdot b)/2}\, e^{i a \cdot x}\, \psi(x + \hbar b). \tag{13.20}
\]

In general, we get incorrect results if we formally apply results for bounded
operators to operators that are unbounded. In this case, however, the result
of the formal calculation is correct. The simplest way to prove this is to
replace $a$ and $b$ by $ta$ and $tb$ on the right-hand side of (13.19) and to check
that the result is a strongly continuous one-parameter unitary group.

Proposition 13.5 For all $a$ and $b$ in $\mathbb{R}^n$, the operators $U_{a,b}(t)$ on $L^2(\mathbb{R}^n)$
given by

\[
(U_{a,b}(t)\psi)(x) = e^{i t^2 \hbar (a \cdot b)/2}\, e^{i t a \cdot x}\, \psi(x + t\hbar b) \tag{13.21}
\]

form a strongly continuous one-parameter unitary group. The infinitesimal
generator of this group coincides with $a \cdot X + b \cdot P$ on $C_c^\infty(\mathbb{R}^n)$ and is
essentially self-adjoint on this domain. Thus, if $\overline{a \cdot X + b \cdot P}$ denotes the
unique self-adjoint extension of the infinitesimal generator on $C_c^\infty(\mathbb{R}^n)$, it
follows from Stone’s theorem that

\[
e^{i t (\overline{a \cdot X + b \cdot P})} = e^{i t^2 \hbar (a \cdot b)/2}\, e^{i t a \cdot X} e^{i t b \cdot P}
\]

for all $t \in \mathbb{R}$. In particular, (13.19) and (13.20) hold.


Proof. It is apparent that $U_{a,b}(t)$ is unitary for each $a$ and $b$, and it is a
simple direct computation to show that it is indeed a unitary group. Strong
continuity is proved in the usual way using a dense subspace, as in the proof
of Example 10.12. When $\psi$ is in $C_c^\infty(\mathbb{R}^n)$, it is easy to differentiate the
right-hand side of (13.21) with respect to $t$ at $t = 0$ to obtain the formula for the
infinitesimal generator. Finally, the essential self-adjointness of $a \cdot X + b \cdot P$
on $C_c^\infty(\mathbb{R}^n)$ is precisely the content of Proposition 9.40. ∎
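
The "simple direct computation" behind the group property can be spot-checked numerically. The sketch below takes $n = 1$ and assumes $\hbar = 1$; the Gaussian test vector `psi0`, the parameter values, and the sample points are hypothetical choices made only for illustration.

```python
import cmath

HBAR = 1.0  # assumed numerical value of hbar

def U(a, b, t):
    """Return the map psi -> U_{a,b}(t) psi of (13.21), in the case n = 1."""
    def apply(psi):
        def new_psi(x):
            phase = cmath.exp(1j * t * t * HBAR * a * b / 2) * cmath.exp(1j * t * a * x)
            return phase * psi(x + t * HBAR * b)
        return new_psi
    return apply

def psi0(x):
    """A sample Gaussian wave function (hypothetical test vector)."""
    return cmath.exp(-x * x / 2)

a, b, s, t = 0.7, -1.3, 0.4, 1.1
lhs = U(a, b, s)(U(a, b, t)(psi0))   # U(s) U(t) psi
rhs = U(a, b, s + t)(psi0)           # U(s + t) psi
max_gap = max(abs(lhs(x) - rhs(x)) for x in [-2.0, -0.5, 0.0, 0.3, 1.7])
print(max_gap)
```

The gap is at the level of floating-point rounding because the three phase exponents $s^2/2$, $t^2/2$, and $st$ combine to $(s+t)^2/2$, which is exactly why the factor $e^{it^2\hbar(a \cdot b)/2}$ is needed in (13.21).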

With the computation of the operator $e^{i(a \cdot X + b \cdot P)}$ in hand, we return to
our analysis of the proposed formula (13.17) for the general Weyl
quantization. If the Fourier transform of $f$ is in $L^1(\mathbb{R}^{2n})$, we can regard the
right-hand side of (13.17) as an absolutely convergent “Bochner” integral
with values in the Banach space $\mathcal{B}(H)$. For our purposes, however, it is
more convenient to think of operators on $L^2(\mathbb{R}^n)$ as integral operators and
to write down a formula for the integral kernel of $Q_{\mathrm{Weyl}}(f)$ in terms of $f$
itself. (But see Exercise 7.)

At a formal level, the operator mapping $\psi$ to $e^{i\hbar(a \cdot b)/2} e^{i a \cdot x} \psi(x + \hbar b)$
may be thought of as an “integral” operator, with integral kernel given by

\[
e^{i\hbar(a \cdot b)/2}\, e^{i a \cdot x}\, \delta_n(x + \hbar b - y), \tag{13.22}
\]

where $\delta_n$ is an n-dimensional delta-function (the n-dimensional analog of
the distribution in Example A.26). Thus, it should be possible to obtain the
integral kernel of $Q_{\mathrm{Weyl}}(f)$ by integrating the preceding expression against
$\hat{f}(a, b)$. To evaluate the resulting integral, we make the change of variable
$c = \hbar b$, in which case we obtain

\begin{align}
&(2\pi\hbar)^{-n} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{i(a \cdot c)/2}\, e^{i a \cdot x}\, \delta_n(x + c - y)\, \hat{f}(a, c/\hbar) \, dc \, da \notag \\
&\quad = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} e^{i(a \cdot (y - x))/2}\, e^{i a \cdot x}\, \hat{f}(a, (y - x)/\hbar) \, da \notag \\
&\quad = \hbar^{-n} (2\pi)^{-n/2} \left[ (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{i a \cdot (x + y)/2}\, \hat{f}(a, (y - x)/\hbar) \, da \right]. \tag{13.23}
\end{align}

We may recognize the integral in square brackets in the last line of (13.23)
as undoing the Fourier transform of $f$ in the x-variable, leaving us with the
partial Fourier transform of $f$ in the p variable, evaluated at the point
$((x + y)/2, (y - x)/\hbar)$. (The partial Fourier transform means the ordinary Fourier
transform with respect to one of the variables, with the other variable
fixed.) Thus, we expect that $Q_{\mathrm{Weyl}}(f)$ should be the integral operator with
integral kernel $\kappa_f$ given by

\[
\kappa_f(x, y) = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} f((x + y)/2, p)\, e^{-i(y - x) \cdot p/\hbar} \, dp. \tag{13.24}
\]


13.3.2 The $L^2$ Theory


With the preceding calculations as motivation, we now define $Q_{\mathrm{Weyl}}(f)$ to
be the integral operator with kernel $\kappa_f$, beginning with the case in which
$f$ belongs to $L^2(\mathbb{R}^{2n})$. The resulting operators will turn out to be
Hilbert-Schmidt operators on $L^2(\mathbb{R}^n)$.

If $H$ is a Hilbert space and $A \in \mathcal{B}(H)$ is a non-negative self-adjoint
operator on $H$, then it can be shown that $A$ has a well-defined (but possibly
infinite) trace. What this means is that the value of

\[
\operatorname{trace}(A) := \sum_j \langle e_j, A e_j \rangle
\]

is the same for each orthonormal basis $\{e_j\}$ of $H$. Note that since $A$ is a
non-negative operator, $\langle e_j, A e_j \rangle$ is a non-negative real number, so that the
sum is always defined, but may have the value $+\infty$.

Now, if $A$ is any bounded operator, then $A^*A$ is self-adjoint and
non-negative. We say that $A$ is Hilbert-Schmidt if

\[
\operatorname{trace}(A^*A) < \infty.
\]

Given two Hilbert-Schmidt operators $A$ and $B$, it can be shown that $A^*B$
is a trace-class operator, meaning that the sum

\[
\operatorname{trace}(A^*B) = \sum_{j=1}^{\infty} \langle e_j, A^*B e_j \rangle
\]

is absolutely convergent and the value of the sum is independent of the
choice of orthonormal basis. We define the Hilbert-Schmidt inner product
of $A$ and $B$ and the associated Hilbert-Schmidt norm of $A$ by

\[
\begin{aligned}
\langle A, B \rangle_{HS} &:= \operatorname{trace}(A^*B) \\
\|A\|_{HS} &:= \sqrt{\operatorname{trace}(A^*A)}.
\end{aligned}
\]

It can be shown that the space of Hilbert-Schmidt operators on $H$ forms a
Hilbert space with respect to the Hilbert-Schmidt inner product.
(See Sect. 19.2 for more details.) We denote the space of Hilbert-Schmidt
operators on $H$ by $HS(H)$.

We will make use of the following standard (and elementary) result
characterizing Hilbert-Schmidt operators on $L^2(\mathbb{R}^n)$ in terms of integral
operators. (See, for example, Theorem VI.23 in Volume I of [34].)


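Proposition 13.6 If $\kappa$ is in $L^2(\mathbb{R}^n \times \mathbb{R}^n)$, then for every $\psi \in L^2(\mathbb{R}^n)$, the
integral

\[
A_\kappa(\psi)(x) = \int_{\mathbb{R}^n} \kappa(x, y)\, \psi(y) \, dy \tag{13.25}
\]

is absolutely convergent for almost every $x \in \mathbb{R}^n$, and $A_\kappa(\psi)$ also belongs
to $L^2(\mathbb{R}^n)$. Furthermore, the operator $A_\kappa$ is a Hilbert-Schmidt operator on
$L^2(\mathbb{R}^n)$ and

\[
\|A_\kappa\|_{HS} = \|\kappa\|_{L^2(\mathbb{R}^n \times \mathbb{R}^n)}.
\]

Conversely, for any Hilbert-Schmidt operator $A$ on $L^2(\mathbb{R}^n)$, there exists
a unique $\kappa \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$ such that $A = A_\kappa$.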
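
A finite-dimensional analogue may help fix ideas. If the integral kernel $\kappa(x, y)$ is replaced by the entries of a square matrix, the norm identity in Proposition 13.6 becomes the statement that $\operatorname{trace}(A^*A)$ equals the sum of the squared absolute values of the entries. The sketch below checks this for hypothetical sample matrices; it illustrates the discrete case only and is not a proof of the proposition.

```python
def adjoint(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def hs_norm(A):
    """Hilbert-Schmidt norm sqrt(trace(A* A))."""
    return abs(trace(matmul(adjoint(A), A))) ** 0.5

def kernel_norm(A):
    """Discrete analogue of the L^2 norm of the kernel: sqrt(sum |kappa(i, j)|^2)."""
    return sum(abs(entry) ** 2 for row in A for entry in row) ** 0.5

kappa = [[1 + 2j, 0.5], [-1j, 3.0]]     # hypothetical sample "kernel"
other = [[0.0, 1j], [2.0, -1.0]]        # a second sample matrix

print(hs_norm(kappa), kernel_norm(kappa))
# Submultiplicativity of the Hilbert-Schmidt norm:
print(hs_norm(matmul(kappa, other)) <= hs_norm(kappa) * hs_norm(other))
```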


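We are now ready, using the discussion in Sect. 13.3.1 as motivation, to define
the Weyl quantization of $L^2$ symbols.


Definition 13.7 For all $f \in L^2(\mathbb{R}^{2n})$, define $\kappa_f : \mathbb{R}^{2n} \to \mathbb{C}$ by

\[
\kappa_f(x, y) = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} f((x + y)/2, p)\, e^{-i(y - x) \cdot p/\hbar} \, dp, \tag{13.26}
\]

and define the Weyl quantization of $f$, as an operator on $L^2(\mathbb{R}^n)$, by

\[
Q_{\mathrm{Weyl}}(f) = A_{\kappa_f},
\]

where $A_{\kappa_f}$ is defined by (13.25).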


The integral in (13.26) is not necessarily absolutely convergent, and
should be understood as computing a partial Fourier transform. Thus, we
should, strictly speaking, replace the right-hand side of (13.26) with

\[
\lim_{R \to \infty} (2\pi\hbar)^{-n} \int_{|p| \le R} f((x + y)/2, p)\, e^{-i(y - x) \cdot p/\hbar} \, dp, \tag{13.27}
\]

where the limit is in the norm topology of $L^2(\mathbb{R}^{2n})$. [The partial Fourier
transform maps the Schwartz space $\mathcal{S}(\mathbb{R}^{2n})$ to itself. By Fubini’s theorem
and the Plancherel formula for $\mathbb{R}^n$, the partial Fourier transform is an
$L^2$-isometry and extends to a unitary map of $L^2(\mathbb{R}^{2n})$ to itself. This unitary
map can be computed by the usual formula on functions in $L^1 \cap L^2$ and
can be computed by a limiting formula similar to (13.27) in general.]

In words, we may describe the procedure for computing $\kappa_f$ at a point
$(x^1, x^2)$ in $\mathbb{R}^{2n}$ as follows. First, compute the partial Fourier transform $\mathcal{F}_p f$
of $f(x, p)$ in the p-variable, resulting in the function $(\mathcal{F}_p f)(x, \xi)$. Then
evaluate $\mathcal{F}_p f$ at the point $x = (x^1 + x^2)/2$, $\xi = (x^2 - x^1)/\hbar$. Finally,
multiply the result by $\hbar^{-n}(2\pi)^{-n/2}$ to get

\[
\kappa_f(x^1, x^2) = \hbar^{-n} (2\pi)^{-n/2}\, (\mathcal{F}_p f)\left((x^1 + x^2)/2, (x^2 - x^1)/\hbar\right). \tag{13.28}
\]


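Theorem 13.8 The map $Q_{\mathrm{Weyl}}$ is a constant multiple of a unitary map
of $L^2(\mathbb{R}^{2n})$ onto $HS(L^2(\mathbb{R}^n))$. The inverse map
$Q_{\mathrm{Weyl}}^{-1} : HS(L^2(\mathbb{R}^n)) \to L^2(\mathbb{R}^{2n})$ is given by

\[
Q_{\mathrm{Weyl}}^{-1}(A)(x, p) = \hbar^n \int_{\mathbb{R}^n} \kappa(x - \hbar b/2, x + \hbar b/2)\, e^{i b \cdot p} \, db,
\]

where $\kappa$ is the integral kernel of $A$ as in Proposition 13.6.
Furthermore, for all $f \in L^2(\mathbb{R}^{2n})$, we have $Q_{\mathrm{Weyl}}(\bar{f}) = Q_{\mathrm{Weyl}}(f)^*$; in
particular, $Q_{\mathrm{Weyl}}(f)$ is self-adjoint if $f$ is real valued.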


Properly speaking, the integral in the theorem should be understood
as an $L^2$ limit, as in (13.27). The fact that $Q_{\mathrm{Weyl}}$ is unitary (up to a
constant) tells us that for an appropriate constant $c$, the operators $c\,e^{i(a \cdot X + b \cdot P)}$
form an “orthonormal basis in the continuous sense” for the Hilbert space
$HS(L^2(\mathbb{R}^n))$. (Compare Sect. 6.6.)

It is possible, using the same formulas, to extend the notion of Weyl
quantization to symbols belonging to the space of tempered distributions,
that is, the space of continuous linear functionals on $\mathcal{S}(\mathbb{R}^{2n})$. We will not,
however, develop this construction here. See [11] for more information.

Proof. Proposition 13.6 gives a unitary identification of $HS(L^2(\mathbb{R}^n))$ with
$L^2(\mathbb{R}^n \times \mathbb{R}^n)$. Thus, it suffices to show that the map $f \mapsto \kappa_f$ is a multiple
of a unitary map. This result holds because the partial Fourier transform
is a unitary map of $L^2(\mathbb{R}^{2n})$ to itself and composition with an invertible
linear map is a constant multiple of a unitary map. The inverse of the map
$f \mapsto \kappa_f$ is obtained by inverting the linear map and undoing the partial
Fourier transform. Finally, it is apparent from (13.26) that

\[
\kappa_{\bar{f}}(x, y) = \overline{\kappa_f(y, x)}.
\]

This, along with Exercise 6, shows that $Q_{\mathrm{Weyl}}(\bar{f}) = Q_{\mathrm{Weyl}}(f)^*$. ∎


13.3.3 The Composition Formula 


If $f$ and $g$ are $L^2$ functions on $\mathbb{R}^{2n}$, then $Q_{\mathrm{Weyl}}(f)$ and $Q_{\mathrm{Weyl}}(g)$ are
Hilbert-Schmidt operators, in which case their product is again Hilbert-Schmidt.
(Indeed, the product of a Hilbert-Schmidt operator and a bounded operator
is always Hilbert-Schmidt.) Thus, since $Q_{\mathrm{Weyl}}$ is a bijection of $L^2(\mathbb{R}^{2n})$ with
$HS(L^2(\mathbb{R}^n))$, there is a unique $L^2$ function, which we denote by $f \star g$, such
that

\[
Q_{\mathrm{Weyl}}(f)\, Q_{\mathrm{Weyl}}(g) = Q_{\mathrm{Weyl}}(f \star g). \tag{13.29}
\]

(Of course, the operation $\star$, like the Weyl quantization itself, depends on $\hbar$,
but we suppress this dependence in the notation.)


Proposition 13.9 The Moyal product $f \star g$ may be characterized in terms
of the Fourier transform as

\[
\widehat{(f \star g)}(a, b) = (2\pi)^{-n} \int \int e^{-i\hbar(a \cdot b' - b \cdot a')/2}\,
\hat{f}(a - a', b - b')\, \hat{g}(a', b') \, da' \, db',
\]

where both integrals are over $\mathbb{R}^n$.


Note that if we set $\hbar = 0$ in the above formula, $\widehat{f \star g}$ reduces to $(2\pi)^{-n}$
times the convolution of $\hat{f}$ and $\hat{g}$, which is nothing but the Fourier transform
of $fg$. It is thus not difficult to show (Exercise 10) that

\[
\lim_{\hbar \to 0} f \star g = fg.
\]

That is to say, the Moyal product $f \star g$ is a “deformation” of the ordinary
pointwise product of functions on $\mathbb{R}^{2n}$. More generally, the Moyal product
can be expanded in an asymptotic expansion in powers of $\hbar$, as explained
in Sect. 2.3 of [11]. This expansion terminates in the case that $f$ and $g$ are
both polynomials.

Proof. It is, of course, possible to obtain this formula using kernel
functions. It is, however, easier to work with (13.17), which can be shown
(Exercise 7) to give the same result as Definition 13.7 when $f$ is a Schwartz
function. We assume standard properties of the Bochner integral for
functions with values in a Banach space [in our case, $\mathcal{B}(H)$], which are similar
to those of the Lebesgue integral. (See, for example, Sect. V.5 of [46].)

We have, then,

\begin{align}
Q_{\mathrm{Weyl}}(f)\, Q_{\mathrm{Weyl}}(g) &= (2\pi)^{-n} \int \int \hat{f}(a, b)\, e^{i(a \cdot X + b \cdot P)} \, da \, db \notag \\
&\quad \cdot (2\pi)^{-n} \int \int \hat{g}(a', b')\, e^{i(a' \cdot X + b' \cdot P)} \, da' \, db'. \tag{13.30}
\end{align}

Now, it is an easy calculation to verify, using Proposition 13.5, that

\[
e^{i(a \cdot X + b \cdot P)}\, e^{i(a' \cdot X + b' \cdot P)}
= e^{-i\hbar(a \cdot b' - b \cdot a')/2}\, e^{i((a + a') \cdot X + (b + b') \cdot P)}, \tag{13.31}
\]

which is what one obtains by formally applying the special case of the
Baker-Campbell-Hausdorff formula in (13.18). Thus, we may combine the
integrals in (13.30) to obtain

\begin{align*}
Q_{\mathrm{Weyl}}(f)\, Q_{\mathrm{Weyl}}(g) &= (2\pi)^{-2n} \int \int \int \int e^{-i\hbar(a \cdot b' - b \cdot a')/2}\, e^{i((a + a') \cdot X + (b + b') \cdot P)} \\
&\quad \times \hat{f}(a, b)\, \hat{g}(a', b') \, da \, db \, da' \, db'.
\end{align*}

By introducing new variables $c = a + a'$ and $d = b + b'$ in the $a$ and $b$
integrals and reversing the order of integration, we obtain, after simplifying
the exponent,

\begin{align*}
Q_{\mathrm{Weyl}}(f)\, Q_{\mathrm{Weyl}}(g) &= (2\pi)^{-n} \int \int \Big[ (2\pi)^{-n} \int \int e^{-i\hbar(c \cdot b' - d \cdot a')/2} \\
&\quad \times \hat{f}(c - a', d - b')\, \hat{g}(a', b') \, da' \, db' \Big]\, e^{i(c \cdot X + d \cdot P)} \, dc \, dd.
\end{align*}

From this and (13.17), we see that $Q_{\mathrm{Weyl}}(f)\, Q_{\mathrm{Weyl}}(g)$ is the Weyl
quantization of the function whose Fourier transform is the quantity in square
brackets above, which is what we wanted to show. ∎


Proposition 13.10 The Moyal product $f \star g$ extends to a continuous map
of $L^2(\mathbb{R}^{2n}) \times L^2(\mathbb{R}^{2n})$ into $L^2(\mathbb{R}^{2n})$ and the composition formula (13.29)
holds for all $f$ and $g$ in $L^2(\mathbb{R}^{2n})$.


Proof. A standard inequality asserts that for any two Hilbert-Schmidt
operators $A$ and $B$, we have

\[
\|AB\|_{HS} \le \|A\|_{HS}\, \|B\|_{HS}.
\]

It follows that the product map $(A, B) \mapsto AB$ is a continuous map of
$HS(L^2(\mathbb{R}^n)) \times HS(L^2(\mathbb{R}^n))$ to $HS(L^2(\mathbb{R}^n))$. Meanwhile, the Weyl
quantization is a constant multiple of a unitary map from $L^2(\mathbb{R}^{2n})$ to $HS(L^2(\mathbb{R}^n))$.
For Schwartz functions $f$ and $g$, the Moyal product is nothing but

\[
f \star g = Q_{\mathrm{Weyl}}^{-1}\left(Q_{\mathrm{Weyl}}(f)\, Q_{\mathrm{Weyl}}(g)\right). \tag{13.32}
\]

The right-hand side of (13.32) provides the desired continuous extension of
$f \star g$. Clearly, the composition formula (13.29) holds for this extension. ∎


13.3.4. Commutation Relations 


In quantum mechanics, the commutator of two operators (divided by $i\hbar$)
plays a role similar to that of the Poisson bracket in classical mechanics.
Thus, we may naturally ask: To what extent does the Weyl quantization
(or any other quantization scheme) map Poisson brackets to commutators?
The short answer is: Not always. Indeed, as we will see in Sect. 13.4, no
“reasonable” quantization scheme can give an exact correspondence
between $\{f, g\}$ on the classical side and $[A, B]/(i\hbar)$ on the quantum side.
Nevertheless, such an exact correspondence does hold for various special 
classes of symbols. If we consider, for example, the class of symbols that 
depend only on x and not on p, then on the classical side, all such functions 
Poisson commute. The Weyl quantization maps such functions f(x) to the 
operator of multiplication by f(x), and thus the quantizations of any two 
such functions commute. A more interesting (in particular, noncommuta- 
tive) example is as follows. 

Proposition 13.11 Suppose $f$ is a polynomial in x and p of degree at
most 2 and $g$ is an arbitrary polynomial in x and p. Then

\[
\frac{1}{i\hbar}\left[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)\right] = Q_{\mathrm{Weyl}}(\{f, g\}), \tag{13.33}
\]

where $\{f, g\}$ is the Poisson bracket of $f$ and $g$.


Here, we define the Weyl quantization by the obvious n-variable
extension of Definition 13.1, and we regard all operators as operating simply
on $C_c^\infty(\mathbb{R}^n)$. See Exercise 8 for another class of symbols on which (13.33)
holds. Although the requirement that $g$ be a polynomial can be relaxed,
we will not attempt to obtain the optimal version of the result.

Proof. For notational simplicity, we abbreviate $Q_{\mathrm{Weyl}}(f)$ to $Q(f)$ for the
duration of the proof. If $f$ has degree zero, then both sides of the desired
equality are zero. Turning to the case in which $f$ has degree 1, we use the
n-variable extension of Proposition 13.4, the proof of which is essentially the
same as the 1-variable result. The result is as follows:

\begin{align*}
Q(x_j g) &= Q(x_j)Q(g) - \frac{i\hbar}{2} Q\!\left(\frac{\partial g}{\partial p_j}\right) \\
&= Q(g)Q(x_j) + \frac{i\hbar}{2} Q\!\left(\frac{\partial g}{\partial p_j}\right).
\end{align*}

By subtracting these two formulas and rearranging, we get

\[
\frac{1}{i\hbar}\left[Q(x_j), Q(g)\right] = Q\!\left(\frac{\partial g}{\partial p_j}\right) = Q(\{x_j, g\}).
\]

A very similar argument establishes the desired result when $f = p_j$ and
thus for all homogeneous polynomials of degree 1.

Suppose now that $f_1$ and $f_2$ are homogeneous polynomials of degree
1 in x and p. Then it follows easily from Proposition 13.4 that for any
polynomial $h$, we have

\[
Q(f_j h) = \frac{1}{2}\left(Q(f_j)Q(h) + Q(h)Q(f_j)\right), \quad j = 1, 2. \tag{13.34}
\]

In particular, we have

\[
Q(f_1 f_2) = \frac{1}{2}\left(Q(f_1)Q(f_2) + Q(f_2)Q(f_1)\right). \tag{13.35}
\]

Using (13.35) and the product rule for commutators (Proposition 3.15), we
have

\begin{align*}
\frac{1}{i\hbar}\left[Q(f_1 f_2), Q(g)\right]
&= \frac{1}{2i\hbar}\big([Q(f_1), Q(g)]Q(f_2) + Q(f_1)[Q(f_2), Q(g)] \\
&\qquad + [Q(f_2), Q(g)]Q(f_1) + Q(f_2)[Q(f_1), Q(g)]\big).
\end{align*}

Using the degree-1 case of the result we are trying to prove, along with
(13.34), we get

\begin{align}
\frac{1}{i\hbar}\left[Q(f_1 f_2), Q(g)\right]
&= \frac{1}{2}\big(Q(\{f_1, g\})Q(f_2) + Q(f_1)Q(\{f_2, g\}) \notag \\
&\qquad + Q(\{f_2, g\})Q(f_1) + Q(f_2)Q(\{f_1, g\})\big) \notag \\
&= Q(f_2\{f_1, g\}) + Q(f_1\{f_2, g\}) \notag \\
&= Q(\{f_1 f_2, g\}), \tag{13.36}
\end{align}

where in the last equality we have used the product rule for the Poisson
bracket. We have now established the desired result when $f$ is a
homogeneous polynomial of degree 0, 1, or 2.

At first glance, it appears that one could extend the result to the case
where $f$ has degree 3, by considering three homogeneous polynomials $f_1$, $f_2$,
and $f_3$ of degree 1 and symmetrizing as in (13.35). The argument breaks
down, however, because the $Q(f_j)$’s do not commute. The $Q(f_j)$’s will not
always occur in the correct order to allow us to pull the $f_j$’s back inside the
Weyl quantization, the way we did in (13.36) in the degree-2 case. Indeed,
an elementary but tedious calculation shows that

\[
\frac{1}{i\hbar}\left[Q_{\mathrm{Weyl}}(x^2 p), Q_{\mathrm{Weyl}}(x p^2)\right] = 3X^2P^2 - 6i\hbar XP - \hbar^2 I,
\]

whereas

\[
Q_{\mathrm{Weyl}}(\{x^2 p, x p^2\}) = 3X^2P^2 - 6i\hbar XP - \frac{3}{2}\hbar^2 I,
\]

so that the two expressions differ by $\hbar^2 I/2$.
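
The "elementary but tedious" calculation can be delegated to a short program. The sketch below is written under stated assumptions: the commutator $[X, P]$ is a formal central symbol `c` standing for $i\hbar I$, operators are kept in the normal form with X to the left of P, and `weyl` uses a standard closed form for the normal-ordered Weyl quantization of a monomial. That closed form is not taken from the text; it is an assumption of this sketch, and it agrees with the ordering average of Definition 13.1 in the low-degree cases used here.

```python
from fractions import Fraction
from math import comb, factorial

# Operators are dicts {(j, k, m): coeff} meaning coeff * c^m * X^j * P^k,
# where c = [X, P] is a formal central symbol (c = i*hbar in the text).

def mul(A, B):
    """Product A * B, rewritten with every X to the left of every P."""
    out = {}
    for (j1, k1, m1), c1 in A.items():
        for (j2, k2, m2), c2 in B.items():
            # P^k1 X^j2 = sum_l C(k1,l) C(j2,l) l! (-c)^l X^(j2-l) P^(k1-l)
            for l in range(min(k1, j2) + 1):
                coeff = c1 * c2 * comb(k1, l) * comb(j2, l) * factorial(l) * (-1) ** l
                key = (j1 + j2 - l, k1 + k2 - l, m1 + m2 + l)
                out[key] = out.get(key, 0) + coeff
    return {key: val for key, val in out.items() if val != 0}

def sub(A, B):
    """Difference A - B, dropping zero coefficients."""
    out = dict(A)
    for key, val in B.items():
        out[key] = out.get(key, 0) - val
    return {key: val for key, val in out.items() if val != 0}

def weyl(j, k):
    """Normal-ordered form of Q_Weyl(x^j p^k); closed form assumed, see above."""
    return {(j - l, k - l, l): Fraction(comb(j, l) * comb(k, l) * factorial(l) * (-1) ** l, 2 ** l)
            for l in range(min(j, k) + 1)}

A, B = weyl(2, 1), weyl(1, 2)
comm = sub(mul(A, B), mul(B, A))
# Divide by c: every term of the commutator carries at least one power of c.
lhs = {(j, k, m - 1): val for (j, k, m), val in comm.items()}
# Classical side: {x^2 p, x p^2} = 3 x^2 p^2.
rhs = {key: 3 * val for key, val in weyl(2, 2).items()}
gap = sub(lhs, rhs)
print(lhs)
print(rhs)
print(gap)
```

With $c = i\hbar$, a coefficient on the key `(0, 0, 2)` multiplies $c^2 I = -\hbar^2 I$, so the single surviving entry of `gap` is the discrepancy between the two displayed expressions.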

We conclude this section with a brief glimpse of an important “equivariance”
property of the Weyl quantization. Note that the Poisson bracket of
two real-valued homogeneous polynomials of degree 2 is again real-valued
and homogeneous of degree 2. The space of real homogeneous polynomials
of degree 2 thus forms a Lie algebra (Sect. 16.3) with respect to the Poisson
bracket. This Lie algebra is naturally isomorphic to the Lie algebra $\mathrm{sp}(n; \mathbb{R})$
of the Lie group $\mathrm{Sp}(n; \mathbb{R})$, the real symplectic group. This group is the group of
invertible linear transformations that preserve a skew-symmetric form on
$\mathbb{R}^{2n}$. See Chap. 16 for information about Lie groups and their Lie algebras.

If we apply Proposition 13.11 in the case in which both $f$ and $g$ are
homogeneous of degree 2, we see that the map $\pi(f) := \frac{1}{i\hbar} Q_{\mathrm{Weyl}}(f)$ is a
representation of $\mathrm{sp}(n; \mathbb{R})$ in the space of skew-symmetric operators on $L^2(\mathbb{R}^n)$.
It can be shown that associated to this representation of $\mathrm{sp}(n; \mathbb{R})$ there is
a projective unitary representation $\Pi$ of the group $\mathrm{Sp}(n; \mathbb{R})$, known as the
metaplectic representation. (See, again, Chap. 16 for definitions.)
Proposition 13.11 is the infinitesimal version of the following equivariance property
of the Weyl quantization: For all $A \in \mathrm{Sp}(n; \mathbb{R})$ and all $f \in L^2(\mathbb{R}^{2n})$, we
have

\[
Q_{\mathrm{Weyl}}(f \circ A^{-1}) = \Pi(A)\, Q_{\mathrm{Weyl}}(f)\, \Pi(A)^{-1}.
\]

See Theorem 2.15 and Chap. 4 of [11] [where our $\Pi(A)$ corresponds to
$\mu((A^*)^{-1})$ in Folland’s notation] for this result and much more about the
metaplectic representation.


13.4. The “No Go” Theorem of Groenewold 


In Sect. 13.3.4, we noted that the Weyl quantization on polynomials satisfies

\[
\frac{1}{i\hbar}\left[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)\right] = Q_{\mathrm{Weyl}}(\{f, g\}), \tag{13.37}
\]

provided that $f$ is a polynomial of degree at most 2, but not in general. One might
think that the failure of (13.37) represents a shortcoming in the definition
of the Weyl quantization, which could be remedied by an alternative
definition. In this section, however, we will see that no quantization scheme that
maps $x_j$ and $p_j$ to the usual position and momentum operators $X_j$ and $P_j$
can satisfy (13.37) for general polynomials in x and p. This sort of
nonexistence result, of a construct satisfying seemingly natural and desirable
conditions, is referred to in the physics literature as a “no go” theorem.

In light of this result, one might think that perhaps the position and 
momentum operators should be defined differently, possibly with an ac- 
companying change in the choice of the quantum Hilbert space. Indeed, 
there is a map $Q$ that satisfies (13.37) for all $f$ and $g$, namely the
prequantization map described in Sect. 23.3. The prequantization map
accomplishes this feat by drastically enlarging the quantum Hilbert space, from
$L^2(\mathbb{R}^n)$ to $L^2(\mathbb{R}^{2n})$. The Hilbert space $L^2(\mathbb{R}^{2n})$ is considered to be “too
big” from a physical standpoint, which explains why the map $Q$ is only
“prequantization” rather than “quantization.” (The prequantization map 
has a number of other undesirable features that are described in Sect. 23.3.) 
If one imposes a natural “smallness” assumption on the quantum Hilbert 
space (irreducibility under the action of the position and momentum op- 
erators), then the Stone-von Neumann theorem will tell us that (modulo 
certain technical domain assumptions) any choice of position and momen- 
tum operators satisfying the canonical commutation relations is unitarily 
equivalent to the usual ones. 

The upshot of the discussion in the two preceding paragraphs is that 
there is no physically reasonable quantization scheme that satisfies (13.37) 
for all (polynomial) functions f and g. 

We turn, now, to Groenewold’s “no go” theorem. We need to make 
domain assumptions, so that it makes sense to compute the commuta- 
tors of the quantized operators. The simplest approach is to assume that 
the quantization Q(f) of any polynomial f will be in the algebra gener- 
ated by the X’s and P’s, and thus that Q(f) will be a differential operator 
with polynomial coefficients. There is a variant of this result, known as van
Hove’s theorem, that proves a similar “no go” result under a more
general assumption about the form of the quantized operators. See [15] for a
rigorous proof of van Hove’s theorem.


Definition 13.12 For any $k \ge 0$, let $\mathcal{P}_k$ denote the space of homogeneous
polynomials of degree $k$ and let $\mathcal{P}_{\le k}$ denote the space of all polynomials of
degree at most $k$.


Theorem 13.13 (Groenewold’s Theorem) Let $\mathcal{D}(\mathbb{R}^n)$ denote the space
of differential operators on $\mathbb{R}^n$ with polynomial coefficients. There does not
exist a linear map $Q : \mathcal{P}_{\le 4} \to \mathcal{D}(\mathbb{R}^n)$ with the following properties.

1. $Q(1) = I$.
2. $Q(x_j) = X_j$ and $Q(p_j) = P_j$.
3. For all $f$ and $g$ in $\mathcal{P}_{\le 3}$, we have

\[
Q(\{f, g\}) = \frac{1}{i\hbar}\left[Q(f), Q(g)\right]. \tag{13.38}
\]


Note that in Property 3 of the theorem, we assume that $f$ and $g$ belong
to $\mathcal{P}_{\le 3}$ rather than $\mathcal{P}_{\le 4}$. This assumption guarantees that $\{f, g\}$ belongs
to $\mathcal{P}_{\le 4}$, so that the left-hand side of (13.38) is defined.

Our strategy in proving Groenewold’s theorem is the following. We know
(Proposition 13.11) that the Weyl quantization satisfies (13.38) if $f$ has
degree at most 2 and $g$ has degree at most 3. Using this result, we can
show that any map $Q$ satisfying the properties in Theorem 13.13 must
coincide with the Weyl quantization on $\mathcal{P}_{\le 3}$. We then identify a polynomial
$f \in \mathcal{P}_4$ that can be expressed as a Poisson bracket in two different ways,
$f = \{g, h\} = \{g', h'\}$, with $g$, $h$, $g'$, and $h'$ in $\mathcal{P}_3$. Upon calculating that
$[Q_{\mathrm{Weyl}}(g), Q_{\mathrm{Weyl}}(h)]$ does not coincide with $[Q_{\mathrm{Weyl}}(g'), Q_{\mathrm{Weyl}}(h')]$, we will
have a contradiction.

The proof will consist of several lemmas, followed by the coup de grace. 


Lemma 13.14 Consider an element A of $D(\mathbb{R}^n)$ expressed as

$$A = \sum_{k} f_k(x) \left(\frac{\partial}{\partial x}\right)^{k},$$

where k ranges over multi-indices, where the $f_k$’s are polynomials, and
where only finitely many of the $f_k$’s are nonzero. Then A is the zero oper-
ator on $C_c^\infty(\mathbb{R}^n)$ only if each of the $f_k$’s is zero.


Proof. For each multi-index k, let $|k| = k_1 + \cdots + k_n$. Suppose not all
the $f_k$’s are zero, let N be the smallest non-negative integer for which $f_k$
is nonzero for some k with $|k| = N$, and let $k_0$ be some multi-index with
$|k_0| = N$ and $f_{k_0} \neq 0$. Let us apply A to a function g that is equal, in a
neighborhood of the origin, to $x^{k_0}$. Then all the terms in Ag other than
the $f_{k_0}$ term will be zero in a neighborhood of the origin, whereas the $f_{k_0}$
term will equal $k_0!\, f_{k_0}(x)$, a nonzero polynomial, in a neighborhood of
the origin. Thus, A is not the zero operator. ∎


Lemma 13.15 If A belongs to $D(\mathbb{R}^n)$ and A commutes with $X_j$ and $P_j$
for all $j = 1, \ldots, n$, then $A = cI$ for some $c \in \mathbb{C}$.


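Proof. We may easily prove by induction that

$$\left(\frac{\partial}{\partial x_j}\right)^{k} (x_j g) = x_j \left(\frac{\partial}{\partial x_j}\right)^{k} g + k \left(\frac{\partial}{\partial x_j}\right)^{k-1} g$$

for any polynomial g. Thus, for any multi-index k, we have

$$\left[ f(x) \left(\frac{\partial}{\partial x}\right)^{k}, X_j \right] = k_j f(x) \left(\frac{\partial}{\partial x}\right)^{k - e_j}, \tag{13.39}$$

where $e_j$ denotes the multi-index with 1 in the jth entry and 0 elsewhere.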


Suppose A is a nonzero element of $D(\mathbb{R}^n)$ that commutes with each $X_j$.
If $\deg(A) = M$, consider a nonzero term in A of degree M:

$$f_{k_0}(x) \left(\frac{\partial}{\partial x}\right)^{k_0}, \qquad |k_0| = M, \quad f_{k_0} \neq 0.$$

If M > 0, we can pick some j such that the jth entry of $k_0$ is nonzero.
By (13.39) and our assumption on A, we have

$$0 = [A, X_j] = (k_0)_j f_{k_0}(x) \left(\frac{\partial}{\partial x}\right)^{k_0 - e_j} + \text{other terms},$$

where the other terms involve multi-indices of the form $k - e_j$, with $k \neq k_0$.
Thus, by Lemma 13.14, $[A, X_j]$ is not the zero operator, which is a
contradiction.

We see, then, that any $A \in D(\mathbb{R}^n)$ that commutes with each $X_j$ must be
of degree zero; that is, A must simply be multiplication by some polynomial
f(x). If, in addition, A commutes with each $P_j$, then

$$0 = [P_j, A] = -i\hbar \frac{\partial f}{\partial x_j}.$$

Thus, actually, f must be constant and A is a multiple of the identity
operator. ∎
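The induction behind the commutation identity used in the proof above can be spot-checked mechanically. The following is a minimal sketch in Python for one variable, representing a polynomial by its list of coefficients (lowest degree first). The helper names `mul_x`, `diff`, and `identity_holds` are illustrative and do not come from the text; the check covers one sample polynomial and a few values of k, so it is evidence for the identity rather than a proof.

```python
from fractions import Fraction


def mul_x(poly):
    """Multiply a polynomial (coefficient list, lowest degree first) by x."""
    return [Fraction(0)] + list(poly)


def diff(poly, times=1):
    """Differentiate a coefficient list `times` times."""
    for _ in range(times):
        poly = [i * c for i, c in enumerate(poly)][1:] or [Fraction(0)]
    return poly


def sub(p, q):
    """Return p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def identity_holds(k, g):
    """Check D^k (x g) - x D^k g == k D^(k-1) g for one polynomial g."""
    lhs = sub(diff(mul_x(g), k), mul_x(diff(g, k)))
    rhs = [k * c for c in diff(g, k - 1)]
    return all(c == 0 for c in sub(lhs, rhs))


# Sample polynomial g(x) = 3 - x + 4x^2 + x^3 - 5x^4 + 9x^5 (arbitrary choice).
g = [Fraction(c) for c in (3, -1, 4, 1, -5, 9)]
results = [identity_holds(k, g) for k in range(1, 8)]
print(results)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.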


Lemma 13.16 For any $f \in P_2$, there exist $g_1, \ldots, g_j$ and $h_1, \ldots, h_j$ in $P_2$
such that

$$f = \{g_1, h_1\} + \cdots + \{g_j, h_j\}.$$

Furthermore, for any $f' \in P_3$, there exist elements $g'_1, \ldots, g'_{j'}$ of $P_3$ and
$h'_1, \ldots, h'_{j'}$ of $P_2$ such that

$$f' = \{g'_1, h'_1\} + \cdots + \{g'_{j'}, h'_{j'}\}.$$

Proof. See Exercise 12. ∎


Lemma 13.17 If Q satisfies the conditions in Theorem 13.13, then Q
coincides with $Q_{\mathrm{Weyl}}$ on $P_{\le 3}$.


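Proof. Our argument leans heavily on Proposition 13.11. Note that, by
assumption, Q coincides with $Q_{\mathrm{Weyl}}$ on $P_{\le 1}$. For $f \in P_2$, let us write
Q(f) as

$$Q(f) = Q_{\mathrm{Weyl}}(f) + A_f.$$

For any $g \in P_{\le 1}$, we have, by (13.38) and Proposition 13.11,

$$\begin{aligned}
Q(\{f, g\}) &= \frac{1}{i\hbar}[Q(f), Q(g)] \\
&= \frac{1}{i\hbar}[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(g)] + \frac{1}{i\hbar}[A_f, Q_{\mathrm{Weyl}}(g)] \\
&= Q_{\mathrm{Weyl}}(\{f, g\}) + \frac{1}{i\hbar}[A_f, Q_{\mathrm{Weyl}}(g)] \\
&= Q(\{f, g\}) + \frac{1}{i\hbar}[A_f, Q_{\mathrm{Weyl}}(g)],
\end{aligned} \tag{13.40}$$

since $\{f, g\} \in P_{\le 1}$. Thus, $[A_f, Q_{\mathrm{Weyl}}(g)] = 0$ for every $g \in P_{\le 1}$, and so, by
Lemma 13.15, we must have $A_f = c_f I$ for some constant $c_f$.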

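Now, if h is in $P_2$, we have, by the just-established result and Proposi-
tion 13.11,

$$\begin{aligned}
Q(\{f, h\}) &= \frac{1}{i\hbar}[Q(f), Q(h)] \\
&= \frac{1}{i\hbar}[Q_{\mathrm{Weyl}}(f) + c_f I,\; Q_{\mathrm{Weyl}}(h) + c_h I] \\
&= \frac{1}{i\hbar}[Q_{\mathrm{Weyl}}(f), Q_{\mathrm{Weyl}}(h)] \\
&= Q_{\mathrm{Weyl}}(\{f, h\}).
\end{aligned} \tag{13.41}$$

That is to say, Q and $Q_{\mathrm{Weyl}}$ agree on elements of $P_2$ of the form $\{f, h\}$, for
$f, h \in P_2$. Thus, by Lemma 13.16, Q and $Q_{\mathrm{Weyl}}$ agree on all of $P_2$, and so
on all of $P_{\le 2}$.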

We now use the $P_{\le 2}$ case of the lemma to establish the $P_3$ case. Given $f \in
P_3$, we write $Q(f) = Q_{\mathrm{Weyl}}(f) + B_f$. Given $g \in P_{\le 1}$, we have $\{f, g\} \in P_{\le 2}$.
Thus, we may argue as in (13.40), applying the just-established $P_{\le 2}$ case of
the lemma to $\{f, g\}$ in the last step. The conclusion is that $[B_f, Q(g)] = 0$
for all $g \in P_{\le 1}$ and thus, by Lemma 13.15, that $B_f = d_f I$ for some constant
$d_f$. Meanwhile, if $h \in P_2$, we argue as in (13.41), but with $c_f$ replaced by
$d_f$ and with $c_h$ now known to be zero. The conclusion is that Q agrees with
$Q_{\mathrm{Weyl}}$ for all elements of $P_3$ of the form $\{f, h\}$ with $f \in P_3$ and $h \in P_2$,
and thus, by Lemma 13.16, for all elements of $P_3$. ∎

Proof of Theorem 13.13. Assume, toward a contradiction, that a map Q
as in the theorem exists. Let f be the polynomial given by

$$f(x, p) = x_1^2 p_1^2.$$

We observe that f can be written in two different ways as a Poisson bracket:

$$x_1^2 p_1^2 = \frac{1}{9}\{x_1^3, p_1^3\} = \frac{1}{3}\{x_1^2 p_1, x_1 p_1^2\}.$$


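Thus, by Lemma 13.17, we must have

$$\frac{1}{9}[Q_{\mathrm{Weyl}}(x_1^3), Q_{\mathrm{Weyl}}(p_1^3)] = i\hbar\, Q(x_1^2 p_1^2) = \frac{1}{3}[Q_{\mathrm{Weyl}}(x_1^2 p_1), Q_{\mathrm{Weyl}}(x_1 p_1^2)].$$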


On the other hand, if we apply both commutators to the constant func- 
tion 1 (or to a function equal to 1 in a neighborhood of the origin), we 
obtain 


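$$\begin{aligned}
\frac{1}{9}[Q_{\mathrm{Weyl}}(x_1^3), Q_{\mathrm{Weyl}}(p_1^3)]\,1 &= \frac{1}{9}(X_1^3 P_1^3 - P_1^3 X_1^3)\,1 \\
&= -\frac{1}{9}(-i\hbar)^3\, 6 \cdot 1.
\end{aligned}$$

Meanwhile, if we compute the quantizations as in (13.4) and then drop all
terms involving $P_1 1$, we obtain (after a small computation)

$$\begin{aligned}
\frac{1}{3}[Q_{\mathrm{Weyl}}(x_1^2 p_1), Q_{\mathrm{Weyl}}(x_1 p_1^2)]\,1 &= -\frac{1}{12}(X_1 P_1^2 P_1 X_1^2 + P_1^2 X_1 P_1 X_1^2)\,1 \\
&= -\frac{1}{12} P_1^2 X_1 P_1 X_1^2\, 1 \\
&= -\frac{1}{12}(-i\hbar)^3\, 4 \cdot 1.
\end{aligned}$$

Since 6/9 does not equal 4/12, we have a contradiction. ∎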
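The two computations in the proof of Theorem 13.13 can be reproduced with exact arithmetic. The sketch below works in one degree of freedom and rests on two assumptions that are ours rather than the text's: the Weyl quantization of a monomial $x^j p^k$ is taken to be the average over all orderings of j factors X and k factors P (the totally symmetrized product), and P is written as $-i\hbar D$ with $D = d/dx$, so that the common factor $(-i\hbar)^3$ is left out and only its coefficient is computed. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def poisson(f, g):
    """{f, g} = f_x g_p - f_p g_x, with polynomials stored as {(j, k): c} for c x^j p^k."""
    out = {}
    for (j1, k1), c1 in f.items():
        for (j2, k2), c2 in g.items():
            coeff = c1 * c2 * (j1 * k2 - k1 * j2)
            if coeff:
                key = (j1 + j2 - 1, k1 + k2 - 1)
                out[key] = out.get(key, 0) + coeff
    return {m: c for m, c in out.items() if c}


def weyl(j, k):
    """Average over all distinct orderings of j letters X and k letters D."""
    words = {"".join(w) for w in permutations("X" * j + "D" * k)}
    return {w: Fraction(1, len(words)) for w in words}


def apply_word(word, poly):
    """Apply a word in X and D to a coefficient list; the rightmost letter acts first."""
    for letter in reversed(word):
        if letter == "X":
            poly = [Fraction(0)] + poly
        else:
            poly = [i * c for i, c in enumerate(poly)][1:] or [Fraction(0)]
    return poly


def apply_op(op, poly):
    """Apply a linear combination of words to a polynomial."""
    total = [Fraction(0)]
    for word, coeff in op.items():
        term = [coeff * c for c in apply_word(word, list(poly))]
        n = max(len(total), len(term))
        total = [(total[i] if i < len(total) else 0)
                 + (term[i] if i < len(term) else 0) for i in range(n)]
    return total


def commutator_on_one(a, b):
    """Return [a, b] applied to the constant function 1, as a coefficient list."""
    one = [Fraction(1)]
    ab = apply_op(a, apply_op(b, one))
    ba = apply_op(b, apply_op(a, one))
    n = max(len(ab), len(ba))
    ab += [Fraction(0)] * (n - len(ab))
    ba += [Fraction(0)] * (n - len(ba))
    return [u - v for u, v in zip(ab, ba)]


def constant_value(poly):
    """Return the value of a constant polynomial, or raise if it is not constant."""
    if any(c != 0 for c in poly[1:]):
        raise ValueError("expected a constant polynomial")
    return poly[0]


# Classical side: both brackets are multiples of x^2 p^2.
pb1 = poisson({(3, 0): 1}, {(0, 3): 1})   # {x^3, p^3}
pb2 = poisson({(2, 1): 1}, {(1, 2): 1})   # {x^2 p, x p^2}

# Quantum side: coefficients of (-i*hbar)^3 after applying each commutator to 1.
lhs = Fraction(1, 9) * constant_value(commutator_on_one(weyl(3, 0), weyl(0, 3)))
rhs = Fraction(1, 3) * constant_value(commutator_on_one(weyl(2, 1), weyl(1, 2)))
print(pb1, pb2, lhs, rhs)
```

The classical brackets come out as $9x^2p^2$ and $3x^2p^2$, while the two quantum coefficients are $-6/9$ and $-4/12$, which is the mismatch the proof exploits.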


13.5 Exercises 


1. Let $P_j$ denote the space of complex-valued homogeneous polynomials
on $\mathbb{R}^2$ of degree j. Then $P_j$ is a complex vector space of dimension
j + 1, which we may identify with $\mathbb{C}^{j+1}$ using the obvious basis for $P_j$.
Let $V_j$ denote the complex subspace of $P_j$ spanned by polynomials
of the form $(ax + bp)^j$, with $a, b \in \mathbb{C}$. Show that $V_j = P_j$.

Hint: Since every subspace of $\mathbb{C}^{j+1}$ is (topologically) closed, if $\gamma(t)$ is
a smooth curve in $V_j$, the derivative $\gamma'(t)$ will also lie in $V_j$.

2. Show that the symmetrized pseudodifferential operator quantization of
$x^2 p^2$ is equal to $Q_{\mathrm{Weyl}}(x^2 p^2) - \hbar^2/2$.


3. Show that Wick-ordered and anti-Wick-ordered quantizations map
real-valued polynomials to symmetric operators on $C_c^\infty(\mathbb{R})$.

Hint: Compare the values of each quantization scheme on $z^k \bar{z}^l$ and
on its complex conjugate $\overline{z^k \bar{z}^l}$.

4. Consider a classical harmonic oscillator with Hamiltonian

$$H(x, p) = \frac{p^2}{2m} + \frac{1}{2} m \omega^2 x^2,$$

where ω is the frequency of the oscillator. Consider the Wick- and
anti-Wick-ordered quantizations with parameter α = 1/(mω). Show
that

$$\begin{aligned}
Q_{\mathrm{Wick}}(H) &= Q_{\mathrm{Weyl}}(H) - \tfrac{1}{2}\hbar\omega, \\
Q_{\mathrm{anti\text{-}Wick}}(H) &= Q_{\mathrm{Weyl}}(H) + \tfrac{1}{2}\hbar\omega.
\end{aligned}$$

5. Let $U_{a,b}(t)$ be as in Proposition 13.5. Show by direct calculation that
these operators form a one-parameter unitary group.


6. Given $\kappa \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$, let $A_\kappa$ denote the associated integral operator
on $L^2(\mathbb{R}^n)$, as in Proposition 13.6. Show that the adjoint $A_\kappa^*$ of $A_\kappa$ is
also an integral operator, with integral kernel κ′ given by

$$\kappa'(x, y) = \overline{\kappa(y, x)}.$$

7. Suppose that $f \in L^2(\mathbb{R}^{2n})$ and that $\hat{f} \in L^1(\mathbb{R}^{2n})$. Then the right-
hand side of (13.17) may be understood as an absolutely convergent
“Bochner” integral with values in the Banach space $B(L^2(\mathbb{R}^n))$. Show
that $Q_{\mathrm{Weyl}}(f)$ as defined by (13.17) coincides with $Q_{\mathrm{Weyl}}(f)$ as de-
fined in Definition 13.7.

Hint: The Bochner integral commutes with applying a bounded lin-
ear functional. Use this result with the linear functional $\Lambda_{\phi,\psi}(A) :=
\langle \phi, A\psi \rangle$ on $B(L^2(\mathbb{R}^n))$. Then use the expression in (13.23) for $\kappa_f$,
which follows from Definition 13.7 by applying a partial Fourier trans-
form.


8. (a) Show that for any polynomial f in one variable, we have

$$Q_{\mathrm{Weyl}}(f(x)p) = f(X)P - \frac{i\hbar}{2} f'(X).$$

(b) Show that for any two polynomials f and g, the Poisson bracket
$\{f(x)p, g(x)p\}$ is of the form h(x)p for some polynomial h.

(c) Show that for any two polynomials f and g, we have

$$\frac{1}{i\hbar}[Q_{\mathrm{Weyl}}(f(x)p), Q_{\mathrm{Weyl}}(g(x)p)] = Q_{\mathrm{Weyl}}(\{f(x)p, g(x)p\}).$$

9. (a) Given φ and ψ in $L^2(\mathbb{R}^n)$, let $|\phi\rangle\langle\psi|$ be the operator defined in
Notation 3.28. Show that $|\phi\rangle\langle\psi|$ can be expressed as an integral
operator as in Proposition 13.6 and determine the associated
integral kernel κ.

(b) For σ > 0, let $\psi_\sigma \in L^2(\mathbb{R}^n)$ be given by the expression

$$\psi_\sigma(x) = (\pi\sigma)^{-n/4} e^{-|x|^2/(2\sigma)}.$$

Using Proposition A.22, show that $\psi_\sigma$ is a unit vector in $L^2(\mathbb{R}^n)$
and that the Weyl symbol of the corresponding one-dimensional
projection operator $|\psi_\sigma\rangle\langle\psi_\sigma|$ is given by

$$Q_{\mathrm{Weyl}}^{-1}(|\psi_\sigma\rangle\langle\psi_\sigma|) = 2^n e^{-|x|^2/\sigma} e^{-\sigma|p|^2/\hbar^2}.$$

Note: If we give σ the value ħ/(mω), the Gaussian function $\psi_\sigma$ may
be thought of as the ground state for an n-dimensional harmonic os-
cillator. (Compare the functions in Theorem 11.3.) The computation
in this exercise plays an important role in the proof of the Stone–von
Neumann theorem in Chap. 14.


10. If f and g are Schwartz functions on $\mathbb{R}^{2n}$, show that $\widehat{f \star g}$ converges
in the $L^1$ norm to $(2\pi)^{-n} \hat{f} * \hat{g}$, where ∗ denotes convolution. Conclude
that $f \star g$ converges uniformly to fg as ħ tends to zero.

11. Suppose that f(p, q) is a homogeneous polynomial of degree 2. Show
that for each t, the Hamiltonian flow $\Phi_t$ associated with f is a linear
map of $\mathbb{R}^{2n}$ to itself.

12. Prove Lemma 13.16.

Hint: Let $g_1 \in P_2$ be given by

$$g_1(x, p) = \sum_{i=1}^{n} x_i p_i.$$

Show that for any monomial of the form $x^j p^k$, we have $\{g_1, x^j p^k\} =
(|k| - |j|)\, x^j p^k$. Thus, most of the standard basis elements f for $P_2$
and all of the standard basis elements f for $P_3$ can be obtained as
nonzero multiples of $\{g_1, f\}$.


14 


The Stone-von Neumann Theorem 


The Stone-von Neumann theorem is a uniqueness theorem for operators 
satisfying the canonical commutation relations. Suppose A and B are two 
self-adjoint operators on H satisfying $[A, B] = i\hbar I$. Suppose also that A
and B act irreducibly on H, meaning that the only closed subspaces of
H invariant under A and B are {0} and H. Then provided that certain
technical assumptions hold (the exponentiated commutation relations), we
will conclude that A and B are unitarily equivalent to the usual position
and momentum operators X and P. That is, there is a unitary operator
$U : H \to L^2(\mathbb{R})$ such that $UAU^{-1} = X$ and $UBU^{-1} = P$. If A and B do not
act irreducibly, then H decomposes as a direct sum of invariant subspaces $V_l$
for A and B, and the restrictions of A and B to each $V_l$ are unitarily
equivalent to the usual X and P.

We begin this chapter with a heuristic argument for the Stone-von Neu- 
mann theorem, an argument that glosses over certain (essential but tech- 
nical) domain issues. Then we introduce the exponentiated commutation 
relations, which should be thought of as a sort of mild strengthening of 
the ordinary canonical commutation relations. Finally, we give a precise 
statement of the theorem and provide a proof. 


14.1 A Heuristic Argument 


Suppose that A and B are any two (possibly unbounded) self-adjoint op-
erators on a separable Hilbert space H satisfying $[A, B] = i\hbar I$. What we
would like to conclude is that H looks like a Hilbert space direct sum of
closed subspaces $V_l$ that are invariant under A and B, and such that each
$V_l$ is unitarily equivalent to $L^2(\mathbb{R})$ in a way that turns the operators A and
B into the standard position and momentum operators X and P. That is
to say, we hope to find unitary maps $U_l : V_l \to L^2(\mathbb{R})$ such that

$$U_l A U_l^{-1} = X, \qquad U_l B U_l^{-1} = P.$$

This conclusion is, however, not quite correct, for reasons having to do
with the domains of the relevant operators. Nevertheless, let us consider
a heuristic argument for this conclusion. We start by forming a lowering
operator $\tilde{a}$ and a raising operator $\tilde{a}^*$ by analogy to the definitions of a and
a* in Chap. 11:

$$\tilde{a} = \frac{m\omega A + iB}{\sqrt{2\hbar m\omega}}; \qquad \tilde{a}^* = \frac{m\omega A - iB}{\sqrt{2\hbar m\omega}}.$$


Then we look at the kernel W of the lowering operator $\tilde{a}$, which will be a
closed subspace of H, provided that $\tilde{a}$ is a closed operator. The elements
of W may be thought of as “ground states” for the operator $\tilde{a}^*\tilde{a}$. Choose
an orthonormal basis $\{\phi_0^l\}$ for W and define vectors

$$\phi_m^l = (\tilde{a}^*)^m \phi_0^l.$$

It is not hard to show that for $l \neq l'$, $\phi_m^l$ is orthogonal to $\phi_{m'}^{l'}$ for all m and
m′. Let $V_l$ denote the closed span of the vectors $\phi_m^l$, m = 0, 1, 2, ....

Using the calculation in Sect. 11.2, we can see that the way $\tilde{a}$ and $\tilde{a}^*$ act
on each chain (the vectors $\phi_m^l$ with l fixed and m varying) is precisely the
same as the way the standard lowering and raising operators a and a* act
on the chain of eigenvectors for a*a. Thus, for each l, we can construct a
unitary map $U_l$ from $V_l$ to $L^2(\mathbb{R})$ by mapping the vectors $\phi_m^l$ in $V_l$ to the
vectors $\psi_m$ in $L^2(\mathbb{R})$ described in Theorems 11.3 and 11.4. (In particular,
the vector $\psi_0 \in L^2(\mathbb{R})$ is the ground state for the harmonic oscillator, which
is a Gaussian.) Since the formula for how $\tilde{a}$ and $\tilde{a}^*$ act is the same as the
formula for how a and a* act, $U_l$ will “intertwine” $\tilde{a}$ with a and $\tilde{a}^*$ with
a*, meaning that $U_l \tilde{a} = a U_l$, and similarly for $\tilde{a}^*$ and a*. It follows
that $U_l$ also intertwines A with X and B with P.

It remains only to argue (heuristically) that the spaces $V_l$ fill up the whole
Hilbert space H. Clearly, the span V of the $V_l$’s is invariant under both
$\tilde{a}$ and $\tilde{a}^*$. Thus, the orthogonal complement $V^\perp$ of V is invariant under
the adjoints $\tilde{a}^*$ and $\tilde{a}$. If $V^\perp$ is not zero, then arguing as in Chap. 11,
there should be a ground state in $V^\perp$, that is, a nonzero vector annihilated
by $\tilde{a}$. This vector would be orthogonal to all the $\phi_0^l$’s, contradicting the
assumption that the $\phi_0^l$’s form an orthonormal basis for the kernel of $\tilde{a}$.

The preceding heuristic argument cannot be completely rigorous, how-
ever, since the counterexample in Sect. 12.2 gives a pair of operators A
and B that satisfy the canonical commutation relations but are clearly not
unitarily equivalent to the usual position and momentum operators. After
all, the “position” operator A in that section is a bounded operator, which
cannot be unitarily equivalent to the usual position operator.

What goes wrong is, as usual, a matter of domain considerations. Setting
m, ħ, and ω equal to 1, we can look for a vector $\phi_0$ that is annihilated by
the operator

$$\tilde{a} = \frac{1}{\sqrt{2}}(A + iB) = \frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right).$$

By the same argument as in Chap. 11, $\phi_0$ must be a constant multiple of the
function $e^{-x^2/2}$. The function $\phi_1 := \tilde{a}^* \phi_0$ is then a multiple of $x e^{-x^2/2}$. The
problem is that $\phi_1$ is not in the domain of $\tilde{a}^*$. After all, $\phi_1$ does not satisfy
the periodic boundary condition $\phi(-1) = \phi(1)$ that defines the domain
of B. Thus, we cannot continue to apply $\tilde{a}^*$ to obtain an orthogonal chain
of vectors and the entire argument breaks down.

What we need, then, is some additional condition that will distinguish 
between the “good” cases of the canonical commutation relations and the 
“bad” cases. One possibility for this additional condition is the exponen- 
tiated form of the canonical commutation relations, which are discussed 
in the following section. Our rigorous proof (Sect. 14.3) of the Stone-von 
Neumann theorem will follow the same outline as the heuristic argument 
in this section, except that the unbounded operators $\tilde{a}$ and $\tilde{a}^*$ will be re-
placed by certain bounded operators, constructed by an analog of the Weyl
quantization.


14.2 The Exponentiated Commutation Relations


If A is a bounded operator on a Hilbert space H, we may define the expo-
nential of A, denoted either $e^A$ or exp(A), by the power series

$$e^A = \sum_{m=0}^{\infty} \frac{A^m}{m!},$$

where $A^0 = I$. A standard power series argument shows that if A, B ∈
B(H) commute, then

$$e^{A+B} = e^A e^B, \qquad [A, B] = 0. \tag{14.1}$$

(See Exercise 6 in Chap. 16.) Even when A and B do not commute, there
is a formula, called the Baker–Campbell–Hausdorff formula, that expresses
$e^A e^B$, for sufficiently small A and B, in the form

$$e^A e^B = \exp\left\{ A + B + \frac{1}{2}[A, B] + \frac{1}{12}[A, [A, B]] + \cdots \right\},$$

where the terms indicated by ⋯ are iterated commutators involving A
and B. (See Chap. 3 of [21] for more information.) A very special case of
this formula is obtained in the case where A and B commute with their
commutator, so that all higher commutators are zero.


Theorem 14.1 Suppose A, B ∈ B(H) commute with their commutator,
that is,

$$[A, [A, B]] = [B, [A, B]] = 0.$$

Then

$$e^A e^B = e^{A + B + \frac{1}{2}[A, B]}.$$

This relation may also be written as

$$e^{A+B} = e^{-\frac{1}{2}[A, B]} e^A e^B.$$

Note that in this special case of the Baker–Campbell–Hausdorff formula,
no smallness assumption is imposed on A and B.
Proof. We will prove that

$$e^{tA} e^{tB} = e^{t(A+B) + \frac{t^2}{2}[A, B]}, \tag{14.2}$$

which reduces to the desired result at t = 1. Since [A, B] commutes with
everything in sight, we can use (14.1) to split the exponential on the right-
hand side of (14.2) into two and then move the factor involving [A, B] to
the other side. Thus, (14.2) is equivalent to the relation

$$e^{tA} e^{tB} e^{-t^2[A, B]/2} = e^{t(A+B)}. \tag{14.3}$$

Let α(t) denote the left-hand side of (14.3). We will show that α(t) satisfies
a simple differential equation, which may be solved explicitly to obtain
$\alpha(t) = e^{t(A+B)}$.

Using term-by-term differentiation, it is easy to verify that

$$\frac{d}{dt} e^{tC} = C e^{tC} = e^{tC} C$$

for any C ∈ B(H), and that

$$\frac{d}{dt} e^{-t^2[A, B]/2} = e^{-t^2[A, B]/2}\,(-t[A, B]).$$

We may then differentiate α(t) using the product rule, which is proved the
same way as in the scalar case, giving

$$\frac{d\alpha}{dt} = e^{tA} A e^{tB} e^{-t^2[A, B]/2} + e^{tA} e^{tB} B e^{-t^2[A, B]/2} + e^{tA} e^{tB} e^{-t^2[A, B]/2}\,(-t[A, B]).$$

To simplify our expression for dα/dt, we need an intermediate result. By
the product rule

$$\frac{d}{dt}\left( e^{-tB} A e^{tB} \right) = e^{-tB}[A, B] e^{tB} = [A, B], \tag{14.4}$$

because B, and thus $e^{tB}$, commutes with [A, B]. Noting that $e^{-tB} A e^{tB} =
A$ when t = 0, we may integrate (14.4) to get

$$e^{-tB} A e^{tB} = A + t[A, B]. \tag{14.5}$$

(The difference of the two sides of (14.5) has derivative zero, so by Part (a)
of Exercise 2, the two sides are equal up to a constant, which is seen to be
zero by evaluating at t = 0.)

Using (14.5), we obtain

$$e^{tA} A e^{tB} = e^{tA} e^{tB} \left( e^{-tB} A e^{tB} \right) = e^{tA} e^{tB} (A + t[A, B]).$$

Moreover, since everything commutes with [A, B], we may commute any-
thing we want past $e^{-t^2[A, B]/2}$. Thus,

$$\begin{aligned}
\frac{d\alpha}{dt} &= \alpha(t)\,(A + t[A, B] + B - t[A, B]) \\
&= \alpha(t)\,(A + B).
\end{aligned}$$

Now, according to Exercise 2, the unique solution to the differential equa-
tion dα/dt = α(t)(A + B) is $\alpha(t) = \alpha(0) e^{t(A+B)}$. Since α(0) = I, we obtain
the desired result (14.3). ∎
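Theorem 14.1 can be checked exactly in a small example. The sketch below uses strictly upper-triangular 3 × 3 matrices, for which every commutator is a multiple of the corner matrix unit and therefore commutes with both A and B; this choice of example, the sample entries, and the helper names are ours, not the text's. Since such matrices are nilpotent, the exponential series terminates and rational arithmetic gives exact equality.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b, scale=1):
    """Return a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def exp_nilpotent(m):
    """Exponential of a nilpotent n-by-n matrix via the (finite) power series."""
    n = len(m)
    result = identity(n)
    term = identity(n)
    for order in range(1, n):  # m^n = 0, so terms of order >= n vanish
        term = [[x / order for x in row] for row in matmul(term, m)]
        result = add(result, term)
    return result


def to_fractions(rows):
    return [[Fraction(x) for x in row] for row in rows]


A = to_fractions([[0, 2, 5], [0, 0, -1], [0, 0, 0]])
B = to_fractions([[0, 1, 7], [0, 0, 3], [0, 0, 0]])
comm = add(matmul(A, B), matmul(B, A), scale=-1)          # [A, B]

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))          # e^A e^B
rhs = exp_nilpotent(add(add(A, B), comm, scale=Fraction(1, 2)))
naive = exp_nilpotent(add(A, B))                          # e^{A+B}, without the correction
print(lhs == rhs, lhs == naive)
```

The comparison with `naive` shows that the correction term $\frac{1}{2}[A, B]$ cannot be dropped when A and B fail to commute.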
Suppose, now, that A and B are unbounded self-adjoint operators satis-
fying

$$[A, B] = i\hbar I, \tag{14.6}$$

and let the exponentials $e^{isA}$ and $e^{itB}$ be defined by means of the spectral
theorem. If we formally apply Theorem 14.1 to isA and itB (even though these
operators are unbounded), we obtain

$$e^{i(sA + tB)} = e^{ist\hbar/2} e^{isA} e^{itB} = e^{-ist\hbar/2} e^{itB} e^{isA},$$

so that

$$e^{isA} e^{itB} = e^{-ist\hbar} e^{itB} e^{isA}. \tag{14.7}$$

It is essential to emphasize that the conclusion (14.7) is only formal, since
it assumes that results for bounded operators carry over to unbounded
operators, which is false in general. Nevertheless, we may hope that in
“good” cases, self-adjoint operators satisfying (14.6) will also satisfy (14.7).
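For readers who want the intermediate step, the formal computation behind (14.7) can be written out as follows. It is only a sketch: as stressed above, Theorem 14.1 is being applied outside its hypotheses, so each equality is formal.

```latex
% Formal step: the commutator of isA and itB is a multiple of the identity,
% hence commutes with both operators, as Theorem 14.1 requires.
\begin{aligned}
[isA,\, itB] &= -st\,[A, B] = -i\hbar st\, I, \\
e^{isA} e^{itB} &= e^{\,i(sA + tB) + \frac{1}{2}[isA,\, itB]}
                 = e^{-ist\hbar/2}\, e^{i(sA + tB)}, \\
e^{itB} e^{isA} &= e^{\,i(sA + tB) + \frac{1}{2}[itB,\, isA]}
                 = e^{+ist\hbar/2}\, e^{i(sA + tB)}.
\end{aligned}
% Solving both lines for e^{i(sA+tB)} and equating gives
% e^{isA} e^{itB} = e^{-ist\hbar} e^{itB} e^{isA}.
```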
Extending the preceding discussion to the case of several degrees of free-
dom in an obvious way, we are led to the following definition.

Definition 14.2 If $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ are possibly unbounded self-
adjoint operators on H, the A’s and B’s satisfy the exponentiated com-
mutation relations if the following relations hold for all $1 \le j, k \le n$ and
$s, t \in \mathbb{R}$:

$$\begin{aligned}
e^{isA_j} e^{itA_k} &= e^{itA_k} e^{isA_j}, \\
e^{isB_j} e^{itB_k} &= e^{itB_k} e^{isB_j}, \\
e^{isA_j} e^{itB_k} &= e^{-ist\hbar\delta_{jk}} e^{itB_k} e^{isA_j}.
\end{aligned}$$

The operators $e^{isA_j}$ and $e^{itB_k}$ are defined by the spectral theorem for un-
bounded self-adjoint operators, and they are unitary operators, defined on
all of H. Thus, when we say that the exponentiated commutation relations
hold, we mean that they hold on the entire Hilbert space H.


Notation 14.3 Suppose operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ satisfy the
exponentiated commutation relations. Then for all a and b in $\mathbb{R}^n$, let
$e^{i(a \cdot A + b \cdot B)}$ denote the unitary operator given by

$$e^{i(a \cdot A + b \cdot B)} = e^{i\hbar(a \cdot b)/2}\, e^{ia_1 A_1} \cdots e^{ia_n A_n}\, e^{ib_1 B_1} \cdots e^{ib_n B_n}. \tag{14.8}$$

Equation (14.8) is nothing but what we obtain by formally applying
Theorem 14.1 to the operators ia · A and ib · B and then further splitting
the exponentials by formally applying (14.1). The notation may be further
justified by checking (Exercise 4) that the operators

$$U_{a,b}(t) = e^{it^2\hbar(a \cdot b)/2}\, e^{ita_1 A_1} \cdots e^{ita_n A_n}\, e^{itb_1 B_1} \cdots e^{itb_n B_n} \tag{14.9}$$

form a strongly continuous one-parameter unitary group. If we then de-
fine a · A + b · B as the infinitesimal generator (Sect. 10.2) of $U_{a,b}$, the
relation (14.8) will indeed hold. Using the definition of $e^{i(a \cdot A + b \cdot B)}$ and the
exponentiated commutation relations, a simple calculation shows that

$$e^{i(a \cdot A + b \cdot B)}\, e^{i(a' \cdot A + b' \cdot B)} = e^{-i\hbar(a \cdot b' - b \cdot a')/2}\, e^{i((a + a') \cdot A + (b + b') \cdot B)}. \tag{14.10}$$

In particular, $e^{-i(a \cdot A + b \cdot B)}$ is the inverse of $e^{i(a \cdot A + b \cdot B)}$, as the notation
suggests.

The following examples show that in the good case (the usual position
and momentum operators on $L^2(\mathbb{R}^n)$), the exponentiated commutation re-
lations do hold, whereas in the bad case (the counterexample in Sect. 12.2),
they do not.


Example 14.4 Let $A_j$ be the usual position operator $X_j$ acting on $L^2(\mathbb{R}^n)$
and let $B_j$ be the usual momentum operator $P_j$. Then the A’s and B’s
satisfy the exponentiated commutation relations.


Proof. Since $X_j$ is just multiplication by $x_j$, it is easily verified that $e^{isX_j}$
is just multiplication by $e^{isx_j}$. Meanwhile, the exponentiated momentum
operators satisfy (Example 10.16)

$$(e^{itP_j}\psi)(x) = \psi(x + t\hbar e_j).$$

It is then evident that $e^{isX_j}$ commutes with $e^{itX_k}$ and that $e^{isP_j}$ commutes
with $e^{itP_k}$. We may also compute that

$$\begin{aligned}
(e^{itP_k} e^{isX_j} \psi)(x) &= e^{is(x_j + t\hbar\delta_{jk})}\, \psi(x + t\hbar e_k) \\
&= e^{ist\hbar\delta_{jk}}\, (e^{isX_j} e^{itP_k} \psi)(x),
\end{aligned}$$

which is what we wanted to prove. ∎
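The relations verified in Example 14.4, together with the product rule (14.10), can be spot-checked numerically for n = 1. The sketch below models operators as Python functions acting on wave functions; the value of ħ, the test function, the sample points, and all names are arbitrary choices made for illustration. It evaluates both sides at a handful of points and reports the largest discrepancy.

```python
import cmath

HBAR = 0.7  # arbitrary positive value, chosen only for this check


def exp_isX(s):
    """e^{isX}: multiplication by e^{isx}."""
    return lambda psi: (lambda x: cmath.exp(1j * s * x) * psi(x))


def exp_itP(t):
    """e^{itP}: the translation psi(x) -> psi(x + t*hbar)."""
    return lambda psi: (lambda x: psi(x + t * HBAR))


def weyl_op(a, b):
    """e^{i(aX + bP)} built from (14.8) with n = 1."""
    phase = cmath.exp(1j * HBAR * a * b / 2)
    return lambda psi: (lambda x: phase * exp_isX(a)(exp_itP(b)(psi))(x))


def psi(x):
    """Arbitrary smooth test function."""
    return cmath.exp(-x * x / 2) * (1 + 0.3j * x)


points = [-1.5, -0.2, 0.0, 0.4, 2.1]
s, t = 0.9, -1.3
a, b, a2, b2 = 0.4, -0.7, 1.1, 0.5

# Third relation of Definition 14.2 with A = X, B = P.
max_err_ccr = max(
    abs(exp_isX(s)(exp_itP(t)(psi))(x)
        - cmath.exp(-1j * s * t * HBAR) * exp_itP(t)(exp_isX(s)(psi))(x))
    for x in points
)

# Product rule (14.10).
phase_1410 = cmath.exp(-1j * HBAR * (a * b2 - b * a2) / 2)
max_err_1410 = max(
    abs(weyl_op(a, b)(weyl_op(a2, b2)(psi))(x)
        - phase_1410 * weyl_op(a + a2, b + b2)(psi)(x))
    for x in points
)
print(max_err_ccr, max_err_1410)
```

Both discrepancies should be at the level of floating-point rounding, since the identities hold exactly for these operators.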


Example 14.5 Let A be the operator in Sect. 12.2 and let B be the (unique 
self-adjoint extension of) the operator in that section. Then A and B do 
not satisfy the exponentiated commutation relations. 


Proof. The operator A is multiplication by x, and so the operator $e^{isA}$
is just multiplication by $e^{isx}$. Meanwhile, the operator B is $-i\hbar\, d/dx$,
with periodic boundary conditions. We will now demonstrate that $e^{itB}$
consists of “translation with wraparound.” Specifically, for any $a \in \mathbb{R}$ and
$\psi \in L^2([-1, 1])$, let us define $S_a\psi \in L^2([-1, 1])$ by

$$(S_a\psi)(x) = \psi(x + a - 2m_{x,a}),$$

where $m_{x,a}$ is the unique integer such that

$$-1 \le x + a - 2m_{x,a} < 1.$$

It is easy to check that $S_a$ is a unitary map of $L^2([-1, 1])$ for each $a \in \mathbb{R}$.
We then claim that

$$e^{itB} = S_{t\hbar}. \tag{14.11}$$

To verify the correctness of (14.11), observe that B has an orthogonal
basis of eigenvectors, namely the functions $\psi_n(x) := e^{i\pi n x}$, $n \in \mathbb{Z}$, with the
corresponding eigenvalues being πnħ. Thus, if we compute $e^{itB}$ by means
of the spectral theorem, we have

$$e^{itB}\psi_n = e^{i\pi n \hbar t}\psi_n.$$

On the other hand,

$$\begin{aligned}
(S_a\psi_n)(x) &= e^{i\pi n(x + a - 2m_{x,a})} \\
&= e^{-2\pi i n m_{x,a}}\, e^{i\pi n a}\, e^{i\pi n x} \\
&= e^{i\pi n a}\, \psi_n(x),
\end{aligned}$$

showing that $e^{itB}$ and $S_{t\hbar}$ agree on each of the functions $\psi_n$, $n \in \mathbb{Z}$, and
thus on all of $L^2([-1, 1])$.

Having computed both $e^{isA}$ and $e^{itB}$, we may now easily see that these
operators do not satisfy the exponentiated commutation relations. We have,
for example, that

$$(e^{isA} e^{itB} 1)(x) = e^{isx},$$

whereas

$$(e^{itB} e^{isA} 1)(x) = e^{is(x + t\hbar - 2m_{x,t\hbar})}.$$

The function $e^{is(x + t\hbar - 2m_{x,t\hbar})}$ is not equal to $e^{ist\hbar} e^{isx}$ but rather to

$$e^{ist\hbar}\, e^{isx}\, e^{-2ism_{x,t\hbar}},$$

where $e^{-2ism_{x,t\hbar}}$ is not always equal to 1. ∎
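The failure described in Example 14.5 is easy to see numerically. The sketch below implements "translation with wraparound" on [-1, 1) and compares $e^{itB} e^{isA} 1$ with $e^{ist\hbar} e^{isA} e^{itB} 1$ at two points: one where the translated argument stays inside the interval and one where it wraps. The parameter values and function names are illustrative choices of ours.

```python
import cmath
import math

HBAR = 1.0  # illustrative value


def wrap(y):
    """Return the point of [-1, 1) that differs from y by an even integer."""
    return y - 2 * math.floor((y + 1) / 2)


def exp_isA(s):
    """e^{isA}: multiplication by e^{isx} on [-1, 1)."""
    return lambda psi: (lambda x: cmath.exp(1j * s * x) * psi(x))


def exp_itB(t):
    """e^{itB} = S_{t*hbar}: translation by t*hbar with wraparound."""
    return lambda psi: (lambda x: psi(wrap(x + t * HBAR)))


def one(x):
    """The constant function 1."""
    return 1.0


s, t = 1.0, 0.5
expected = cmath.exp(1j * s * t * HBAR)  # ratio predicted by the relations


def ratio(x):
    """(e^{itB} e^{isA} 1)(x) divided by (e^{isA} e^{itB} 1)(x)."""
    ab = exp_isA(s)(exp_itB(t)(one))(x)
    ba = exp_itB(t)(exp_isA(s)(one))(x)
    return ba / ab


gap_inside = abs(ratio(0.0) - expected)   # x + t*hbar = 0.5 stays in [-1, 1)
gap_wrapped = abs(ratio(0.9) - expected)  # x + t*hbar = 1.4 wraps to -0.6
print(gap_inside, gap_wrapped)
```

At the interior point the predicted phase is reproduced, while at the wrapped point the extra factor $e^{-2is}$ appears, so the relation cannot hold on all of $L^2([-1, 1])$.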


14.3 The Theorem


We give two versions of the Stone-von Neumann theorem, one for general 
operators satisfying the exponentiated commutation relations and one for 
the special case where the operators act irreducibly. 


Definition 14.6 Operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$ satisfying the ex-
ponentiated commutation relations are said to act irreducibly on H if the
only closed subspaces of H that are invariant under every $e^{itA_j}$ and every
$e^{itB_j}$ are {0} and H.


Proposition 14.7 The usual position and momentum operators act irre-
ducibly on $L^2(\mathbb{R}^n)$.

We delay the proof of this result until near the end of this section.


Theorem 14.8 (Stone–von Neumann Theorem) Suppose $A_1, \ldots, A_n$
and $B_1, \ldots, B_n$ are self-adjoint operators on H satisfying the exponentiated
commutation relations. Then H can be decomposed as an orthogonal direct
sum of closed subspaces $\{V_l\}$ with the following properties. First, each $V_l$ is
invariant under $e^{itA_j}$ and $e^{itB_j}$ for all j and t. Second, there exist unitary
operators $U_l : V_l \to L^2(\mathbb{R}^n)$ such that

$$U_l\, e^{itA_j}\, U_l^{-1} = e^{itX_j}$$

and

$$U_l\, e^{itB_j}\, U_l^{-1} = e^{itP_j}$$

for all j and t.
If, in addition, the A’s and B’s act irreducibly on H, then there exists a
single unitary map $U : H \to L^2(\mathbb{R}^n)$ such that

$$U e^{itA_j} U^{-1} = e^{itX_j}$$

and

$$U e^{itB_j} U^{-1} = e^{itP_j}$$

for all j and t. The map U is unique up to multiplication by a constant of absolute
value 1.


The preceding results can be expressed in terms of the Heisenberg group; 
see Exercise 6. 

Our strategy (as in von Neumann’s 1931 paper [41]) in proving Theo-
rem 14.8 is to follow the outline of the heuristic argument in Sect. 14.1, but
replacing the unbounded raising and lowering operators by the bounded
operators $e^{i(a \cdot A + b \cdot B)}$ in Notation 14.3. If we define $\phi_0 \in L^2(\mathbb{R}^n)$ by

$$\phi_0(x) = (\pi\sigma)^{-n/4} e^{-|x|^2/(2\sigma)} \tag{14.12}$$

for some σ > 0, then $\phi_0$ is a unit vector, which we may think of as the
ground state of an n-dimensional harmonic oscillator with frequency ω =
ħ/(mσ). We can easily compute the Weyl symbol of the projection $|\phi_0\rangle\langle\phi_0|$
onto $\phi_0$ as follows:

$$f_0(x, p) := Q_{\mathrm{Weyl}}^{-1}(|\phi_0\rangle\langle\phi_0|) = 2^n e^{-|x|^2/\sigma} e^{-\sigma|p|^2/\hbar^2}. \tag{14.13}$$

(See Exercise 9 in Chap. 13.)

We may define a generalized Weyl quantization Q for H by using the op-
erators $e^{i(a \cdot A + b \cdot B)}$ in place of the operators $e^{i(a \cdot X + b \cdot P)}$ in (13.17). We will
show that the operator P := Q($f_0$) is an orthogonal projection, and we will
take W := Range(P) as our space of ground states in H. A crucial result
will be that the projection P is nonzero and, indeed, that the restriction
of P to any nonzero subspace invariant under the $e^{i(a \cdot A + b \cdot B)}$’s is nonzero.

If $\{\psi^l\}$ is an orthonormal basis for W, consider the vectors

$$\psi^l_{a,b} := e^{i(a \cdot A + b \cdot B)}\psi^l.$$

We will show that these vectors are orthogonal for different values of l,
and that for fixed l, the inner product of two such vectors is the same
as in the $L^2(\mathbb{R}^n)$ case. Thus, if $V_l$ denotes the closed span of the $\psi^l_{a,b}$’s
with l fixed and a and b varying, we can construct a unitary map from
$V_l$ to $L^2(\mathbb{R}^n)$ that intertwines the operators $e^{i(a \cdot A + b \cdot B)}$ with the operators
$e^{i(a \cdot X + b \cdot P)}$. The sum of the $V_l$’s must be all of H, for if not, the orthogonal
complement Y of the span would be invariant under the $e^{i(a \cdot A + b \cdot B)}$’s. Thus,
the restriction of P to Y would be nonzero, implying that there are nonzero
elements of W := Range(P) orthogonal to every $\psi^l$, contradicting the assumption
that the $\psi^l$’s span W.

The rest of this section will flesh out the argument sketched in the pre-
ceding paragraphs.

Definition 14.9 Suppose self-adjoint operators $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$
satisfy the exponentiated commutation relations on H. For any $f \in S(\mathbb{R}^{2n})$,
define Q(f) ∈ B(H) by the formula

$$Q(f) = (2\pi)^{-n} \int_{\mathbb{R}^{2n}} \hat{f}(a, b)\, e^{i(a \cdot A + b \cdot B)}\, da\, db,$$

where $\hat{f}$ is the Fourier transform of f and where $e^{i(a \cdot A + b \cdot B)}$ is as in
Notation 14.3. The integral is a Bochner integral with values in the Ba-
nach space B(H).


We will assume the following standard properties of the Bochner integral 
(Sect. V.5 of [46]). First, any continuous function $f : \mathbb{R}^{2n} \to B(H)$ for which
$\int \|f(x)\|\, dx < \infty$ has a well-defined Bochner integral. Second, the Bochner
integral commutes with applying bounded linear transformations. Third, a 
version of Fubini’s theorem holds. 


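Proposition 14.10 For any operators satisfying the exponentiated com-
mutation relations, the associated map Q in Definition 14.9 has the follow-
ing properties.


1. If $f \in S(\mathbb{R}^{2n})$ is real valued, Q(f) is self-adjoint.
2. For all a and b in $\mathbb{R}^n$ and $f \in S(\mathbb{R}^{2n})$, we have

$$\begin{aligned}
e^{i(a \cdot A + b \cdot B)}\, Q(f) &= Q(f'), \\
Q(f)\, e^{i(a \cdot A + b \cdot B)} &= Q(f''),
\end{aligned}$$

where f′ and f″ are the functions with Fourier transforms given by

$$\begin{aligned}
\widehat{f'}(a', b') &= e^{-i\hbar(a \cdot b' - b \cdot a')/2}\, \hat{f}(a' - a, b' - b), \\
\widehat{f''}(a', b') &= e^{i\hbar(a \cdot b' - b \cdot a')/2}\, \hat{f}(a' - a, b' - b).
\end{aligned}$$

3. For all f and g in $S(\mathbb{R}^{2n})$, we have

$$Q(f)Q(g) = Q(f \star g),$$

where ⋆ is the Moyal product described in Proposition 13.9.
4. For all $f \in S(\mathbb{R}^{2n})$, if Q(f) = 0 then f = 0.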


Using both parts of Point 2 of the proposition, we can see that for all
$a, b \in \mathbb{R}^n$, we have

$$e^{i(a \cdot A + b \cdot B)}\, Q(f)\, e^{-i(a \cdot A + b \cdot B)} = Q(g),$$

where

$$\hat{g}(a', b') = e^{-i\hbar(a \cdot b' - b \cdot a')}\, \hat{f}(a', b'). \tag{14.14}$$

Proof. For Point 1, we can re-express Q(f) as

$$(2\pi)^{-n} \int_{\mathbb{R}^{2n}} \frac{1}{2}\left[ \hat{f}(a, b)\, e^{i(a \cdot A + b \cdot B)} + \hat{f}(-a, -b)\, e^{-i(a \cdot A + b \cdot B)} \right] da\, db,$$

since the change of variable a′ = −a, b′ = −b brings the second term
equal to the first term. If f is real valued, then $\hat{f}(-a, -b)$ is the conjugate
of $\hat{f}(a, b)$, so that the expression in square brackets in the integral is self-
adjoint for each (a, b).

For the first part of Point 2, we use (14.10) to obtain

$$e^{i(a \cdot A + b \cdot B)}\, Q(f) = (2\pi)^{-n} \int_{\mathbb{R}^{2n}} e^{-i\hbar(a \cdot b' - b \cdot a')/2}\, \hat{f}(a', b')\, e^{i((a + a') \cdot A + (b + b') \cdot B)}\, da'\, db'.$$

Making the change of variables a″ = a′ + a and b″ = b′ + b and simplifying
gives the desired result. The proof of the second part of Point 2 is similar.
The proof of Point 3 is precisely the same as the proof of Proposition 13.9,
which relies only on the exponentiated commutation relations.


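For Point 4, suppose that Q(f) = 0 for some $f \in S(\mathbb{R}^{2n})$. Then for all
φ, ψ ∈ H and all $a, b \in \mathbb{R}^n$, we have

$$\begin{aligned}
0 &= \langle e^{-i(a \cdot A + b \cdot B)}\phi,\; Q(f)\, e^{-i(a \cdot A + b \cdot B)}\psi \rangle \\
&= \langle \phi,\; e^{i(a \cdot A + b \cdot B)}\, Q(f)\, e^{-i(a \cdot A + b \cdot B)}\psi \rangle \\
&= \langle \phi, Q(g)\psi \rangle,
\end{aligned}$$

where g is as in (14.14). Thus,

$$0 = \int_{\mathbb{R}^{2n}} e^{-i\hbar(a \cdot b' - b \cdot a')}\, \hat{f}(a', b')\, \langle \phi, e^{i(a' \cdot A + b' \cdot B)}\psi \rangle\, da'\, db' \tag{14.15}$$

for all φ, ψ and a, b. But (14.15) is just computing, up to a constant, the
inverse Fourier transform of the function $\hat{f}(a', b')\langle \phi, e^{i(a' \cdot A + b' \cdot B)}\psi \rangle$,
evaluated at the point (ħb, −ħa). By the Fourier inversion formula, then,
this function must be zero for almost every pair (a′, b′). Now, the function
$\langle \phi, e^{i(a' \cdot A + b' \cdot B)}\psi \rangle$ is a continuous function of (a′, b′) and by taking
$\phi = e^{i(a_0 \cdot A + b_0 \cdot B)}\psi$, it can be
made to be nonzero at any given point $(a_0, b_0)$ in $\mathbb{R}^{2n}$, and thus also in
a neighborhood of that point. Thus, actually, $\hat{f}$ is identically zero and so
also is f. ∎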


Lemma 14.11 Let $f_0$ be the function on $\mathbb{R}^{2n}$ given by

$$f_0(x, p) = 2^n e^{-|x|^2/\sigma} e^{-\sigma|p|^2/\hbar^2},$$

where σ is a fixed positive number. Then for all $a, b \in \mathbb{R}^n$, we have

$$Q(f_0)\, e^{i(a \cdot A + b \cdot B)}\, Q(f_0) = e^{-\sigma|a|^2/4}\, e^{-\hbar^2|b|^2/(4\sigma)}\, Q(f_0). \tag{14.16}$$

In particular,

$$Q(f_0)^2 = Q(f_0).$$
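Proof. By Proposition 14.10, (14.16) is equivalent to the assertion that

$$f_0 \star f_0' = e^{-\sigma|a|^2/4}\, e^{-\hbar^2|b|^2/(4\sigma)}\, f_0, \tag{14.17}$$

where $f_0'$ is obtained from $f_0$ as in Point 2 of Proposition 14.10.
Now, it is certainly possible to establish (14.17) by direct computation from
the definitions of $f_0'$ and ⋆; all the integrals involved will be Gaussian inte-
grals, which can be evaluated by means of Proposition A.22. This approach,
however, is both painful and unilluminating. A more sensible approach is
to observe that it suffices to verify (14.16) for the ordinary Weyl quantiza-
tion on $L^2(\mathbb{R}^n)$. After all, (14.16) is equivalent to (14.17), which in turn is
equivalent to the identity

$$Q_{\mathrm{Weyl}}(f_0)\, e^{i(a \cdot X + b \cdot P)}\, Q_{\mathrm{Weyl}}(f_0) = e^{-\sigma|a|^2/4}\, e^{-\hbar^2|b|^2/(4\sigma)}\, Q_{\mathrm{Weyl}}(f_0), \tag{14.18}$$

by applying Proposition 14.10 in the case $Q = Q_{\mathrm{Weyl}}$.
Now, by Exercise 9 in Chap. 13, $Q_{\mathrm{Weyl}}(f_0)$ is the one-dimensional pro-
jection $|\phi_0\rangle\langle\phi_0|$, where $\phi_0(x) = (\pi\sigma)^{-n/4} e^{-|x|^2/(2\sigma)}$. Thus,

$$\begin{aligned}
Q_{\mathrm{Weyl}}(f_0)\, e^{i(a \cdot X + b \cdot P)}\, Q_{\mathrm{Weyl}}(f_0) &= |\phi_0\rangle\langle\phi_0|\, e^{i(a \cdot X + b \cdot P)}\, |\phi_0\rangle\langle\phi_0| \\
&= c\, |\phi_0\rangle\langle\phi_0|,
\end{aligned} \tag{14.19}$$

where

$$c = \langle \phi_0 |\, e^{i(a \cdot X + b \cdot P)}\, | \phi_0 \rangle.$$

To compute c, we use (13.20), which gives

$$c = (\pi\sigma)^{-n/2}\, e^{i\hbar(a \cdot b)/2} \int_{\mathbb{R}^n} e^{-|x|^2/(2\sigma)}\, e^{ia \cdot x}\, e^{-|x + \hbar b|^2/(2\sigma)}\, dx. \tag{14.20}$$

The integral in (14.20) can be computed by expanding $|x + \hbar b|^2$, collecting
terms in the exponent, and applying Proposition A.22. The result, after a
bit of algebra, is

$$c = e^{-\sigma|a|^2/4}\, e^{-\hbar^2|b|^2/(4\sigma)},$$

which gives (14.18). ∎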
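The "bit of algebra" for the integral in (14.20) can be sketched as follows. The vector w and the bilinear (not Hermitian) dot product on $\mathbb{C}^n$ are notation introduced here for convenience, and the last step assumes that the Gaussian integral formula (as in Proposition A.22) remains valid after a complex shift of the variable.

```latex
% Exponent of the integrand in (14.20), after expanding |x + \hbar b|^2:
-\frac{|x|^2}{2\sigma} + i a \cdot x - \frac{|x + \hbar b|^2}{2\sigma}
  = -\frac{1}{\sigma}\left( |x|^2 + 2\, w \cdot x \right) - \frac{\hbar^2 |b|^2}{2\sigma},
  \qquad w := \tfrac{1}{2}(\hbar b - i\sigma a).

% Completing the square, with w \cdot w = \tfrac{1}{4}(\hbar^2|b|^2 - 2i\hbar\sigma\, a \cdot b - \sigma^2|a|^2):
-\frac{1}{\sigma}\left( |x|^2 + 2\, w \cdot x \right)
  = -\frac{(x + w) \cdot (x + w)}{\sigma} + \frac{w \cdot w}{\sigma}.

% The shifted Gaussian integrates to (\pi\sigma)^{n/2}, so
c = e^{i\hbar (a \cdot b)/2}
    \exp\!\left( \frac{\hbar^2|b|^2}{4\sigma} - \frac{i\hbar\, a \cdot b}{2}
                 - \frac{\sigma|a|^2}{4} - \frac{\hbar^2|b|^2}{2\sigma} \right)
  = e^{-\sigma|a|^2/4}\, e^{-\hbar^2|b|^2/(4\sigma)}.
```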

We now prove the claimed irreducibility of the usual position and mo- 
mentum operators. 
Proof of Proposition 14.7. Given operators $A_1,\ldots,A_n$ and $B_1,\ldots,B_n$
satisfying the exponentiated commutation relations, consider the operator
$Q(f_0)$, where $f_0$ is as in (14.13). According to Lemma 14.11, $Q(f_0)^2 = Q(f_0)$.
Since also $f_0$ is real valued, $Q(f_0)$ is self-adjoint and thus an orthogonal
projection. Suppose that the range of the orthogonal projection $Q(f_0)$
is one-dimensional. We then claim that the $A$'s and $B$'s act irreducibly. If
not, there would exist a nontrivial closed subspace $V$ that is invariant under
each of the operators $e^{i(a\cdot A+b\cdot B)}$. Then the nonzero subspace $V^\perp$ would
also be invariant under each of the operators $(e^{i(a\cdot A+b\cdot B)})^* = e^{-i(a\cdot A+b\cdot B)}$.
Thus, the exponentiated commutation relations are satisfied in both $V$ and
$V^\perp$, with the $A$'s and $B$'s being the infinitesimal generators of the
restrictions of $e^{itA_j}$ and $e^{itB_j}$ to each subspace.

It follows that the restriction of $Q(f_0)$ to each of these subspaces may be
thought of as the generalized Weyl quantizations for $V$ and $V^\perp$ of the
function $f_0$. Applying Point 4 of Proposition 14.10 to $V$ and to $V^\perp$, we conclude
that the restrictions of $Q(f_0)$ to $V$ and to $V^\perp$ are nonzero. Thus, both $V$
and $V^\perp$ will contain nonzero elements of $\mathrm{Range}(Q(f_0))$, contradicting our
assumption that $\mathrm{Range}(Q(f_0))$ is one dimensional.

In the case of $L^2(\mathbb{R}^n)$, we have $Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|$, where $\phi_0$ is given
by (14.12), which clearly has a one-dimensional range. Thus, the usual
position and momentum operators act irreducibly on $L^2(\mathbb{R}^n)$.

We are finally ready for the proof of the Stone-von Neumann theorem. 
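Proof of Theorem 14.8. Let $W = \mathrm{Range}(Q(f_0))$, where $f_0$ is given
by (14.13) for some fixed $\sigma > 0$. For $\phi,\psi \in W$, we can use (14.10),
Lemma 14.11, and the fact that $Q(f_0)$ is the identity on $W$ to obtain

\[
\begin{aligned}
\langle e^{i(a\cdot A+b\cdot B)}\phi,\ e^{i(a'\cdot A+b'\cdot B)}\psi\rangle
&= \langle Q(f_0)\phi,\ e^{-i(a\cdot A+b\cdot B)}e^{i(a'\cdot A+b'\cdot B)}Q(f_0)\psi\rangle \\
&= e^{i\hbar(a\cdot b'-b\cdot a')/2}\,\langle\phi,\ Q(f_0)e^{i((a'-a)\cdot A+(b'-b)\cdot B)}Q(f_0)\psi\rangle \\
&= e^{i\hbar(a\cdot b'-b\cdot a')/2}\,e^{-\sigma|a'-a|^2/4-\hbar^2|b'-b|^2/(4\sigma)}\,\langle\phi,\psi\rangle.
\end{aligned}
\tag{14.21}
\]

Now let $\{\psi^l\}$ be an orthonormal basis for $W$ and define vectors $\psi^l_{a,b}$,
$a,b \in \mathbb{R}^n$, by
\[
\psi^l_{a,b} = e^{i(a\cdot A+b\cdot B)}\psi^l .
\]

By (14.21), $\psi^l_{a,b}$ is orthogonal to $\psi^{l'}_{a',b'}$ whenever $l \neq l'$. Furthermore,

\[
\langle\psi^l_{a,b},\ \psi^l_{a',b'}\rangle
  = e^{i\hbar(a\cdot b'-b\cdot a')/2}\,e^{-\sigma|a'-a|^2/4}\,e^{-\hbar^2|b'-b|^2/(4\sigma)},
\tag{14.22}
\]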


where the right-hand side of (14.22) is “universal,” that is, independent of
$l$ and independent of the particular Hilbert space in which we are working.

Let $V_l$ be the closed span of the vectors $\psi^l_{a,b}$, with $l$ fixed and $a,b$ varying.
We may define a map $U_l : V_l \to L^2(\mathbb{R}^n)$ by requiring that

\[
U_l\Big(\sum_{j=1}^{N}\alpha_j\,\psi^l_{a_j,b_j}\Big) = \sum_{j=1}^{N}\alpha_j\,\phi_{a_j,b_j}
\]

for every sequence $a_1,\ldots,a_N$ and $b_1,\ldots,b_N$ of vectors, where

\[
\phi_{a,b} = e^{i(a\cdot X+b\cdot P)}\phi_0 .
\]

This map is isometric by (14.22) on linear combinations of the $\psi^l_{a,b}$'s and
thus extends uniquely to an isometric map of $V_l$ into $L^2(\mathbb{R}^n)$. [In particular,
$U_l$ is well defined: If some linear combination of $\psi^l_{a,b}$'s is zero, then this
linear combination has norm zero and so its image under $U_l$ also has norm
zero and is thus zero in $L^2(\mathbb{R}^n)$.]

Now, $V_l$ is invariant under the operators $e^{i(a\cdot A+b\cdot B)}$ by (14.10), and,
similarly, the image of $V_l$ under $U_l$ is invariant under the operators $e^{i(a\cdot X+b\cdot P)}$.
By the irreducibility of $L^2(\mathbb{R}^n)$ (Proposition 14.7), we conclude that $U_l$
maps $V_l$ onto $L^2(\mathbb{R}^n)$ and is, therefore, unitary. Furthermore, using (14.10) and
the analogous expression (13.31) for the position and momentum operators,
it is easy to check that each $U_l$ intertwines $e^{i(a\cdot A+b\cdot B)}$ with $e^{i(a\cdot X+b\cdot P)}$ for
all $a,b \in \mathbb{R}^n$. In particular, taking either $a = te_j$ and $b = 0$ or $a = 0$ and
$b = te_j$, we see that $U_l$ intertwines $e^{itA_j}$ with $e^{itX_j}$. Similarly, $U_l$ intertwines
$e^{itB_j}$ with $e^{itP_j}$.

We now argue that the Hilbert space direct sum of the orthogonal subspaces
$V_l$ is all of $\mathbf{H}$. If not, then as in the proof of Proposition 14.7, the
orthogonal complement $Y$ of this sum would be invariant under the operators
$e^{i(a\cdot A+b\cdot B)}$ and thus also under the operator $Q(f_0)$. Furthermore, as
in the proof of Proposition 14.7, the restriction of $Q(f_0)$ to $Y$ would be
nonzero. Thus, there would exist elements of $W = \mathrm{Range}(Q(f_0))$ orthogonal
to each $\psi^l$, contradicting the assumption that the $\psi^l$'s span $W$.

It remains only to address the irreducible case. If the $A$'s and $B$'s act
irreducibly, then there can be only one subspace, $V_l = \mathbf{H}$, which means
that $W$ must be one dimensional. Any unitary map $U : \mathbf{H} \to L^2(\mathbb{R}^n)$ that
intertwines each operator $e^{i(a\cdot A+b\cdot B)}$ with $e^{i(a\cdot X+b\cdot P)}$ must also intertwine
each operator of the form $Q(f)$ with $Q_{\mathrm{Weyl}}(f)$. It follows that $U$ must map
the one-dimensional subspace $W$ unitarily onto the one-dimensional range
of $Q_{\mathrm{Weyl}}(f_0) = |\phi_0\rangle\langle\phi_0|$. Thus, the restriction of $U$ to $W$ is unique up to a
constant of absolute value 1. But the reasoning leading to the existence of
$U$ shows that $U$ is determined by its action on $W$, so the entire map $U$ is
unique up to a constant.


14.4 The Segal-Bargmann Space 


A simple example of the Stone-von Neumann theorem is provided by the
Hilbert space $\mathbf{H} := L^2(\mathbb{R}^n)$, together with the operators $A_j := P_j$ and
$B_j := -X_j$. In that case (Exercise 3), the unitary map $U$ in the Stone-von
Neumann theorem will simply be a scaled version of the Fourier transform,
as in Definition 6.1. To obtain a more interesting example, we construct a
Hilbert space consisting of holomorphic functions on $\mathbb{C}^n$.


14.4.1 The Raising and Lowering Operators


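A smooth function $F : \mathbb{C}^n \to \mathbb{C}$ is said to be holomorphic if it is
holomorphic as a function of each $z_j$ with the other $z_k$'s fixed. Equivalently, $F$
is holomorphic if $\partial F/\partial\bar{z}_j = 0$ for each $j$, where

\[
\frac{\partial}{\partial\bar{z}_j} = \frac{1}{2}\left(\frac{\partial}{\partial x_j} + i\,\frac{\partial}{\partial y_j}\right).
\]

The operator

\[
\frac{\partial}{\partial z_j} = \frac{1}{2}\left(\frac{\partial}{\partial x_j} - i\,\frac{\partial}{\partial y_j}\right)
\]

preserves the space of holomorphic functions on $\mathbb{C}^n$.

Consider the operators $z_j$ (i.e., multiplication by $z_j$) and $\hbar\,\partial/\partial z_j$,
acting on the space of holomorphic functions on $\mathbb{C}^n$. Fock [9] observed that
these operators satisfy the following commutation relations:

\[
[z_j, z_k] = \left[\hbar\frac{\partial}{\partial z_j},\ \hbar\frac{\partial}{\partial z_k}\right] = 0,
\qquad
\left[\hbar\frac{\partial}{\partial z_j},\ z_k\right] = \hbar\,\delta_{jk}.
\tag{14.23}
\]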


These are essentially the same commutation relations as the raising and
lowering operators considered in Sect. 11.2. Specifically, (14.23) are the
relations that would be satisfied by the natural higher-dimensional analogs
of the operators $a$ and $a^*$ in that section if we omitted the factor of $\sqrt{\hbar}$ in
the denominator in (11.4) and (11.5).

Now, if we wish to interpret the operators $z_j$ and $\hbar\,\partial/\partial z_j$ as raising and
lowering operators, then we should look for an inner product on the space
of holomorphic functions that would make these two operators adjoints
of each other. After all, the analysis in Chap. 11 strongly depends on the
assumption that $a$ and $a^*$ are adjoints of each other. In the early 1960s,
Segal [36] and Bargmann [2] identified such an inner product. Once we have
described this Segal-Bargmann inner product, we will construct self-adjoint
“position” and “momentum” operators as appropriate linear combinations
of $z_j$ and $\hbar\,\partial/\partial z_j$. We will then verify the exponentiated commutation
relations and irreducibility, allowing us to apply the Stone-von Neumann
theorem.

We look for an $L^2$ inner product with respect to a measure having a
positive density with respect to the Lebesgue measure on $\mathbb{C}^n$.


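Lemma 14.12 Suppose that $\mu$ is a smooth, strictly positive density on $\mathbb{C}^n$
and that $F$ and $G$ are sufficiently nice (but not necessarily holomorphic)
functions on $\mathbb{C}^n$. Then

\[
\int_{\mathbb{C}^n} \overline{F(z)}\,\frac{\partial G}{\partial z_j}\,\mu(z)\,dz
  = -\int_{\mathbb{C}^n} \overline{\frac{\partial F}{\partial\bar{z}_j}}\,G(z)\,\mu(z)\,dz
    - \int_{\mathbb{C}^n} \overline{F(z)}\,\frac{\partial\log\mu}{\partial z_j}\,G(z)\,\mu(z)\,dz,
\tag{14.24}
\]

where $dz$ denotes the $2n$-dimensional Lebesgue measure on $\mathbb{C}^n = \mathbb{R}^{2n}$.


Equation (14.24) tells us that

\[
\left(\frac{\partial}{\partial z_j}\right)^{*} = -\frac{\partial}{\partial\bar{z}_j} - \frac{\partial\log\mu}{\partial\bar{z}_j},
\]

where the adjoint is computed with respect to the inner product for the
Hilbert space $L^2(\mathbb{C}^n,\mu)$. If we restrict the adjoint operator $(\partial/\partial z_j)^*$ to
the space of holomorphic functions, then the $\partial/\partial\bar{z}_j$ term is zero, by the
definition of a holomorphic function.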
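Proof. Let us approximate the integral over $\mathbb{C}^n$ on the left-hand side
of (14.24) by an integral over a large cube. By performing either the $x_j$-integral
or the $y_j$-integral first, we can integrate by parts to push the derivatives
with respect to $x_j$ or $y_j$ off of $G$ and onto the product of $\overline{F}$ and $\mu$
(with a minus sign). The boundary term in the integration by parts will
involve the function $\overline{F(z)}\,G(z)\,\mu(z)$ integrated over two opposite faces of
the cube. If this function tends to zero sufficiently rapidly at infinity, the
boundary terms will vanish in the limit. In that case, we obtain

\[
\int_{\mathbb{C}^n} \overline{F(z)}\,\frac{\partial G}{\partial z_j}\,\mu(z)\,dz
  = -\int_{\mathbb{C}^n} \frac{\partial\overline{F(z)}}{\partial z_j}\,G(z)\,\mu(z)\,dz
    - \int_{\mathbb{C}^n} \overline{F(z)}\,G(z)\,\frac{\partial\mu}{\partial z_j}\,dz,
\]

provided that all three of the above integrals are absolutely convergent.
Since $\partial\overline{F}/\partial z_j = \overline{\partial F/\partial\bar{z}_j}$ and

\[
\frac{\partial\mu}{\partial z_j} = \frac{\partial\log\mu}{\partial z_j}\,\mu,
\]

we obtain (14.24). ∎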

We now look for a density $\mu_\hbar$ for which $\partial\log\mu_\hbar/\partial\bar{z}_j = -z_j/\hbar$. In that
case, the adjoint operator $(\partial/\partial z_j)^*$ preserves the holomorphic subspace of
$L^2(\mathbb{C}^n,\mu_\hbar)$ and is given on this subspace by multiplication by $z_j/\hbar$.


Lemma 14.13 Specialize Lemma 14.12 to the case in which $F$ and $G$ are
holomorphic polynomials and $\mu$ is the density $\mu_\hbar$ given by

\[
\mu_\hbar(z) = (\pi\hbar)^{-n}\,e^{-|z|^2/\hbar}.
\tag{14.25}
\]

Then we have

\[
\int_{\mathbb{C}^n} \overline{F(z)}\,\frac{\partial G}{\partial z_j}\,\mu_\hbar(z)\,dz
  = \frac{1}{\hbar}\int_{\mathbb{C}^n} \overline{z_j F(z)}\,G(z)\,\mu_\hbar(z)\,dz.
\tag{14.26}
\]

Proof. In the case that $F$ and $G$ are holomorphic polynomials, $\partial F/\partial\bar{z}_j = 0$,
so the first term on the right-hand side of (14.24) is zero. Furthermore, $\overline{F}G\mu_\hbar$
decreases rapidly at infinity and so the boundary terms vanish in this case.
Finally, we may compute $\partial\log\mu_\hbar/\partial z_j$ as $-\bar{z}_j/\hbar$, giving (14.26). ∎


Definition 14.14 The Segal-Bargmann space, denoted $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, is
the space of holomorphic functions $F$ on $\mathbb{C}^n$ for which

\[
\|F\|_\hbar := \left(\int_{\mathbb{C}^n} |F(z)|^2\,\mu_\hbar(z)\,dz\right)^{1/2} < \infty,
\]

where $\mu_\hbar$ is as in (14.25). Define raising and lowering operators $a_j^*$ and
$a_j$ on $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ by

\[
a_j^* = z_j, \qquad a_j = \hbar\,\frac{\partial}{\partial z_j},
\]

with the domain of $a_j^*$ and $a_j$ consisting of the space of holomorphic
polynomials.


In light of Lemma 14.13, the operators $a_j$ and $a_j^*$ satisfy

\[
\langle F,\ a_j G\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = \langle a_j^* F,\ G\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)}
\]

for all holomorphic polynomials $F$ and $G$, thus justifying the notation $a_j^*$
for the raising operator. The space $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ is also sometimes called
the Fock space. It should be noted, however, that in quantum field theory,
the term Fock space also refers to a different (but related) space: the
completion of the tensor algebra over a fixed Hilbert space.
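
The adjoint relation between $a_j$ and $a_j^*$ can be checked mechanically on polynomials. The sketch below assumes $n = 1$ and represents a polynomial by its list of coefficients. It relies on the orthogonality of distinct monomials together with the value $\|z^k\|^2 = \hbar^k\,k!$, which comes from a standard polar-coordinate computation and is an assumption of this illustration rather than a formula stated above. The function names are hypothetical.

```python
import math


def inner(F, G, hbar=1.0):
    """<F, G> for coefficient lists F[k], G[k] of z^k, in the case n = 1.

    Uses orthogonality of monomials and ||z^k||^2 = hbar^k * k!.
    The inner product is conjugate-linear in the first argument.
    """
    return sum(
        complex(f).conjugate() * g * hbar ** k * math.factorial(k)
        for k, (f, g) in enumerate(zip(F, G))
    )


def raise_op(F):
    """a^* F = z F: shift every coefficient up by one degree."""
    return [0] + list(F)


def lower_op(F, hbar=1.0):
    """a F = hbar dF/dz."""
    return [hbar * k * F[k] for k in range(1, len(F))]
```

For any two coefficient lists `F` and `G`, the numbers `inner(F, lower_op(G, hbar), hbar)` and `inner(raise_op(F), G, hbar)` should coincide, which is the $n = 1$ polynomial case of the relation $\langle F, a G\rangle = \langle a^* F, G\rangle$.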


Proposition 14.15 The Segal-Bargmann space is complete with respect
to the norm $\|\cdot\|_\hbar$ and forms a Hilbert space with respect to the associated
inner product,

\[
\langle F, G\rangle_\hbar = \int_{\mathbb{C}^n} \overline{F(z)}\,G(z)\,\mu_\hbar(z)\,dz.
\]

Furthermore, the space of holomorphic polynomials forms a dense subspace
of the Segal-Bargmann space.


Note that elements of $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ are actual functions on $\mathbb{C}^n$, not
equivalence classes of functions. Nevertheless, we can regard $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ as a
subspace of $L^2(\mathbb{C}^n,\mu_\hbar)$, since each equivalence class of almost-everywhere
equal functions contains at most one holomorphic representative.
Proof. Given any $z_0 \in \mathbb{C}^n$ and $R > 0$, let $P_{z_0,R}$ denote the polydisk given
by

\[
P_{z_0,R} = \{\, z \in \mathbb{C}^n : |z_j - (z_0)_j| < R,\ j = 1,\ldots,n \,\}.
\]

Using a power-series argument, it is easy to show that the value of a
holomorphic function $F$ at $z_0$ is equal to the average of $F$ over $P_{z_0,R}$. We can
then multiply and divide by $\mu_\hbar$ to obtain

\[
F(z_0) = \frac{1}{(\pi R^2)^n}\int_{P_{z_0,R}} F(z)\,\frac{1}{\mu_\hbar(z)}\,\mu_\hbar(z)\,dz.
\]

The Cauchy-Schwarz inequality then tells us that

\[
|F(z_0)| \le \frac{1}{(\pi R^2)^n}\,\Big\| 1_{P_{z_0,R}}\,\frac{1}{\mu_\hbar} \Big\|_{L^2(\mathbb{C}^n,\mu_\hbar)}\,\|F\|_{L^2(\mathbb{C}^n,\mu_\hbar)}.
\tag{14.27}
\]

This inequality tells us that pointwise evaluation [the map $F \mapsto F(z_0)$] is
a bounded linear functional on the Segal-Bargmann space.

Suppose now that $F_n$ is a sequence of holomorphic functions such that
$F_n$ converges in $L^2(\mathbb{C}^n,\mu_\hbar)$ to some $F$. Using (14.27), we can easily show
that $F_n$ converges to $F$ uniformly on compact sets, which implies that $F$ is
also holomorphic. This shows that the holomorphic subspace of $L^2(\mathbb{C}^n,\mu_\hbar)$
is closed and hence is a Hilbert space.

To show the denseness of polynomials, consider some $F \in \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$
and let

\[
F(z) = \sum_{\mathbf{n}} a_{\mathbf{n}}\,z^{\mathbf{n}}
\tag{14.28}
\]

be the Taylor expansion of $F$, where $\mathbf{n}$ ranges over all multi-indices. This
series converges to $F$ uniformly on compact subsets of $\mathbb{C}^n$. We claim that
the terms in (14.28) are orthogonal. To see this, use Fubini's theorem to
perform the integration of $z^{\mathbf{n}}$ against $\overline{z^{\mathbf{m}}}$ one variable at a time. Using
polar coordinates in each copy of $\mathbb{C}$, we can see that we will get zero if the
power of $z_j$ in $z^{\mathbf{n}}$ is not the same as the power of $z_j$ in $z^{\mathbf{m}}$.

Since it is orthogonal, the series in (14.28) will converge in $L^2(\mathbb{C}^n,\mu_\hbar)$
provided that the sum of the squares of the norms of the terms is finite. If
$P_{0,R}$ is a sequence of polydisks of increasing radius centered at the origin,
the argument in the preceding paragraph shows that the terms in (14.28)
are orthogonal in $L^2(P_{0,R},\mu_\hbar)$. Since the series converges uniformly on $P_{0,R}$,
we can then interchange sum and integral to obtain

\[
\sum_{\mathbf{n}} |a_{\mathbf{n}}|^2\,\|z^{\mathbf{n}}\|^2_{L^2(P_{0,R},\mu_\hbar)} = \|F\|^2_{L^2(P_{0,R},\mu_\hbar)}.
\]

By applying monotone convergence to both the sum over $\mathbf{n}$ and the integrals
over $P_{0,R}$, we may let $R$ tend to infinity to obtain

\[
\sum_{\mathbf{n}} |a_{\mathbf{n}}|^2\,\|z^{\mathbf{n}}\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)} = \|F\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)} < \infty.
\]

Thus, the series in (14.28) converges in $L^2(\mathbb{C}^n,\mu_\hbar)$ and this $L^2$ limit must
coincide with the pointwise limit, namely $F$ itself. ∎


14.4.2 The Exponentiated Commutation Relations 


To apply the Stone-von Neumann theorem to the Segal-Bargmann space,
we define self-adjoint “position” and “momentum” operators as follows:

\[
A_j = \frac{1}{\sqrt{2}}\left(z_j + \hbar\,\frac{\partial}{\partial z_j}\right),
\qquad
B_j = \frac{i}{\sqrt{2}}\left(z_j - \hbar\,\frac{\partial}{\partial z_j}\right).
\]

We will identify one-parameter unitary groups having (extensions of) these
operators as their infinitesimal generators, which will show (by Stone's
theorem) that the generators are indeed self-adjoint on suitable domains.
We will then verify the exponentiated commutation relations and check
irreducibility.

Let us compute heuristically and then check that our results are correct.
If we formally apply Theorem 14.1 to the (unbounded) operators $-\sum_j \bar{a}_j z_j$
and $\hbar\sum_j a_j\,\partial/\partial z_j$, we obtain

\[
\exp\Big\{\sum_j\Big(-\bar{a}_j z_j + \hbar a_j\frac{\partial}{\partial z_j}\Big)\Big\}
  = \exp\Big\{-\frac{\hbar}{2}|a|^2\Big\}\,\exp\Big\{-\sum_j \bar{a}_j z_j\Big\}\,\exp\Big\{\hbar\sum_j a_j\frac{\partial}{\partial z_j}\Big\}.
\tag{14.29}
\]

This calculation suggests that we define operators $T_a$ by the formula
\[
(T_a F)(z) = e^{-\hbar|a|^2/2}\,e^{-\bar{a}\cdot z}\,F(z+\hbar a), \qquad a \in \mathbb{C}^n,
\tag{14.30}
\]

where for any $a,b \in \mathbb{C}^n$, we define $a\cdot b = \sum_j a_j b_j$ (no complex conjugates).
Since the exponent on the left-hand side of (14.29) is skew-self-adjoint (the
difference of an operator and its adjoint), we expect the operators $T_a$ to
be unitary. For suitable choices of $a$, the operator on the left-hand side
of (14.29) will become the one-parameter group generated by $A_j$ or $B_j$.

Theorem 14.16 For each $a \in \mathbb{C}^n$, the operator $T_a$ defined by (14.30) is
a unitary operator on the Segal-Bargmann space, and the map $a \mapsto T_a$ is
strongly continuous. These operators satisfy

\[
T_a T_b = e^{i\hbar\,\mathrm{Im}(\bar{a}\cdot b)}\,T_{a+b}.
\tag{14.31}
\]

In particular, for each $j$, the maps

\[
U_j(t) = T_{ite_j/\sqrt{2}}, \qquad V_j(t) = T_{te_j/\sqrt{2}}
\]

are strongly continuous one-parameter unitary groups. The infinitesimal
generators $A_j$ and $B_j$ of these groups satisfy the exponentiated commutation
relations.

For any $F \in \mathrm{Dom}(A_j)$, we have

\[
(A_j F)(z) = \frac{1}{\sqrt{2}}\left(z_j F(z) + \hbar\,\frac{\partial F}{\partial z_j}\right)
\]

and for any $F \in \mathrm{Dom}(B_j)$, we have

\[
(B_j F)(z) = \frac{i}{\sqrt{2}}\left(z_j F(z) - \hbar\,\frac{\partial F}{\partial z_j}\right).
\]

Furthermore, the domains of $A_j$ and $B_j$ contain all holomorphic polynomials.

Finally, the operators $A_j$ and $B_j$ act irreducibly on the Segal-Bargmann
space, in the sense of Definition 14.6.


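Proof. It is evident that $(T_a F)(z)$ is holomorphic as a function of $z$ for each
fixed $a$. Meanwhile, for any $F \in \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, we have

\[
\begin{aligned}
\|T_a F\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)}
&= (\pi\hbar)^{-n}\int_{\mathbb{C}^n} e^{-\hbar|a|^2}\,e^{-2\,\mathrm{Re}(\bar{a}\cdot z)}\,|F(z+\hbar a)|^2\,e^{-|z|^2/\hbar}\,dz \\
&= (\pi\hbar)^{-n}\int_{\mathbb{C}^n} e^{-|z+\hbar a|^2/\hbar}\,|F(z+\hbar a)|^2\,dz \\
&= \|F\|^2_{L^2(\mathbb{C}^n,\mu_\hbar)},
\end{aligned}
\]

showing that $T_a$ is isometric. The formula for $T_a T_b$ follows from direct
computation (Exercise 7), and from this formula we see that $T_a T_{-a} = I$,
which shows that $T_a$ is surjective and thus unitary. The strong continuity
of $T_a$ is easily verified on polynomials (Exercise 8), which are dense in
$\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$.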
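
The composition law for the operators $T_a$ can be spot-checked numerically. The sketch below assumes $n = 1$ and takes the phase in the composition law to be $e^{i\hbar\,\mathrm{Im}(\bar{a} b)}$, which is what a direct computation from the definition $(T_aF)(z) = e^{-\hbar|a|^2/2}e^{-\bar{a}z}F(z+\hbar a)$ produces. The value of `HBAR`, the test polynomial, and the function names are arbitrary choices made for this illustration.

```python
import cmath

HBAR = 0.6  # arbitrary positive test value


def T(a, F):
    """Return the function T_a F of (14.30) in the case n = 1."""
    def TaF(z):
        return (cmath.exp(-HBAR * abs(a) ** 2 / 2)
                * cmath.exp(-a.conjugate() * z)
                * F(z + HBAR * a))
    return TaF


def F(z):
    # An arbitrary holomorphic polynomial used as a test function.
    return 1 + 2j * z - z ** 3


def composition_defect(a, b, z):
    """|T_a T_b F(z) - exp(i hbar Im(conj(a) b)) T_{a+b} F(z)|."""
    lhs = T(a, T(b, F))(z)
    phase = cmath.exp(1j * HBAR * (a.conjugate() * b).imag)
    return abs(lhs - phase * T(a + b, F)(z))
```

For complex `a`, `b`, and `z`, the value `composition_defect(a, b, z)` should vanish up to rounding. Taking `b = -a` gives a trivial phase, which is the numerical counterpart of $T_a T_{-a} = I$.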

It easily follows from (14.31) that $U_j(\cdot)$ and $V_j(\cdot)$ are one-parameter
unitary groups, and also that (the infinitesimal generators of) these unitary
groups satisfy the exponentiated commutation relations. If $F$ is in the
domain of the infinitesimal generator of $U_j(\cdot)$, the limit

\[
(A_j F)(z) = \frac{1}{i}\,\lim_{t\to 0}\frac{1}{t}\left[e^{-\hbar t^2/4}\,e^{itz_j/\sqrt{2}}\,F(z + it\hbar e_j/\sqrt{2}) - F(z)\right]
\tag{14.32}
\]

must exist in $L^2(\mathbb{C}^n,\mu_\hbar)$. The $L^2$ limit coincides with the easily computed
pointwise limit, giving

\[
A_j F(z) = \frac{1}{i}\left[\frac{iz_j}{\sqrt{2}}\,F(z) + \frac{i\hbar}{\sqrt{2}}\,\frac{\partial F}{\partial z_j}\right],
\]

as claimed. If $F$ is a polynomial, it is easily shown, using dominated
convergence, that the limit in (14.32) exists in $L^2(\mathbb{C}^n,\mu_\hbar)$. The analysis of $B_j$
is similar.

Finally, we address irreducibility. If the $A_j$'s and $B_j$'s did not act
irreducibly, then in the application of the Stone-von Neumann theorem to
$\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, there would exist at least two subspaces $V_l$. Thus, there would
exist at least two linearly independent vectors $F_l$ such that for all $j$, we have
that $F_l$ is in the domain of $A_j$ and $B_j$ and

\[
(A_j + iB_j)F_l = \sqrt{2}\,\hbar\,\frac{\partial F_l}{\partial z_j} = 0.
\]

(Take $F_l$ to be the preimage under $U_l$ of the function $\phi_0$ in (14.12), with $\sigma = \hbar$.)
This would mean that each $F_l$ is constant, contradicting the assumption
that the $F_l$'s are linearly independent.


14.4.3 The Reproducing Kernel


According to (14.27), evaluation of $F \in \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ at a fixed point $z$ is
a continuous linear functional. Thus, this linear functional can be written
as the inner product with a unique element $\chi_z$ of $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, which we
now compute. The vector $\chi_z$ is called the coherent state with parameter $z$.


Proposition 14.17 For all $F \in \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, we have
\[
F(z) = \int_{\mathbb{C}^n} e^{z\cdot\bar{w}/\hbar}\,F(w)\,\mu_\hbar(w)\,dw.
\tag{14.33}
\]


The function $e^{z\cdot\bar{w}/\hbar}$ is called the reproducing kernel for $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$,
since integration against this kernel simply gives back (or “reproduces”)
the function $F$. Of course, the relation (14.33) holds only for holomorphic
functions in $L^2(\mathbb{C}^n,\mu_\hbar)$. Equation (14.33) can be rewritten as

\[
F(z) = \langle\chi_z, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)},
\]

where
\[
\chi_z(w) = e^{\bar{z}\cdot w/\hbar}.
\]
Proof. We begin by establishing the result in the case $z = 0$. We have
already established, in the proof of Proposition 14.15, that the Taylor series
of $F$ converges to $F$ in $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, and the distinct monomials in this
series are orthogonal. Thus, when computing $\langle 1, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)}$, only the
constant term in the expansion of $F$ survives, giving

\[
\langle 1, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = F(0)\,\langle 1, 1\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = F(0),
\tag{14.34}
\]

since $\mu_\hbar$ is a probability measure. But this relation is precisely the $z = 0$
case of (14.33).

Let us now apply (14.34) to $T_a F$, where $T_a$ is the unitary operator
in (14.30). According to Theorem 14.16, $T_a$ is unitary with inverse equal
to $T_{-a}$, giving

\[
(T_a F)(0) = \langle 1, T_a F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = \langle T_{-a}1, F\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)}.
\]

Writing this relation out using $w$ as our variable of integration gives
\[
e^{-\hbar|a|^2/2}\,F(\hbar a) = \int_{\mathbb{C}^n} e^{-\hbar|a|^2/2}\,e^{a\cdot\bar{w}}\,F(w)\,\mu_\hbar(w)\,dw.
\]

Setting $a = z/\hbar$ and simplifying gives the desired result.
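
The reproducing property can also be observed numerically. The sketch below assumes $n = 1$, uses the kernel $e^{z\bar{w}/\hbar}$ and the Gaussian density $(\pi\hbar)^{-1}e^{-|w|^2/\hbar}$, and replaces the integral over $\mathbb{C}$ by a Riemann sum on a square grid. The grid size, the cutoff, and the function name `reproduce` are choices made for this illustration only.

```python
import cmath
import math


def reproduce(F, z, hbar=1.0, half_width=8.0, steps=161):
    """Riemann-sum approximation of the integral of
    exp(z * conj(w) / hbar) * F(w) * mu_hbar(w) over the complex plane (n = 1)."""
    h = 2.0 * half_width / (steps - 1)
    total = 0.0 + 0.0j
    for i in range(steps):
        u = -half_width + i * h
        for j in range(steps):
            v = -half_width + j * h
            w = complex(u, v)
            total += (cmath.exp(z * w.conjugate() / hbar)
                      * F(w)
                      * math.exp(-abs(w) ** 2 / hbar))
    # Density normalization (pi * hbar)^(-1) times the area of one grid cell.
    return total * h * h / (math.pi * hbar)
```

For a holomorphic polynomial `F` and a moderate value of `z`, `reproduce(F, z)` should return `F(z)` up to a small numerical error. Applying the same routine to a non-holomorphic function such as `lambda w: w.conjugate()` does not give back the function, in line with the remark that the relation holds only for holomorphic functions.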


14.4.4 The Segal-Bargmann Transform


Since the operators $A_j$ and $B_j$ in Theorem 14.16 satisfy the exponentiated
commutation relations and act irreducibly on $\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$, the second part
of the Stone-von Neumann theorem tells us that there is a unitary map
$U : \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar) \to L^2(\mathbb{R}^n)$, unique up to a constant, that intertwines these
operators with the usual position and momentum operators. The inverse
map $V : L^2(\mathbb{R}^n) \to \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$ is called the Segal-Bargmann transform.


Theorem 14.18 Let $V$ be the inverse of the map $U : \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar) \to L^2(\mathbb{R}^n)$
given by the Stone-von Neumann theorem, normalized so that $V$
takes the function $\phi_0 \in L^2(\mathbb{R}^n)$ in (14.12) (with $\sigma = \hbar$) to the constant
function $1 \in \mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)$. Then $V$ may be computed as follows:

\[
(V\psi)(z) = (\pi\hbar)^{-n/4}\int_{\mathbb{R}^n} \exp\left\{-\frac{1}{2\hbar}\left(z\cdot z - 2\sqrt{2}\,z\cdot x + x\cdot x\right)\right\}\psi(x)\,dx.
\]


Recall that we define $a\cdot b = \sum_j a_j b_j$ for all $a,b \in \mathbb{C}^n$, with no complex
conjugates in the definition. In particular, the integrand in the formula for
$V\psi$ is a holomorphic function of $z$, for each fixed $x$.

Note that the value of $(V\psi)(z)$ at $z = 0$ is simply the inner product of $\psi$
with the ground state function $\phi_0$, with $\sigma = \hbar$. The proof of Theorem 14.18
will show that the value of $(V\psi)(z)$ at an arbitrary $z$ is a certain constant
$c_z$ times the inner product of $\psi$ with a phase space translate of $\phi_0$, that is,
a vector of the form $e^{ia\cdot X}e^{ib\cdot P}\phi_0$. [See (14.36).] According to (the obvious
higher-dimensional counterpart to) Proposition 12.11, $\phi_0$ is a minimum
uncertainty state, meaning that equality is achieved in Corollary 12.9 for each
$j$. Thus, by (the obvious higher-dimensional counterpart to) Exercise 3 in
Chap. 12, each state of the form $e^{ia\cdot X}e^{ib\cdot P}\phi_0$ is also a minimum uncertainty
state.
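Proof. By the unitarity of $V$ and the $z = 0$ case of Proposition 14.17, we
have

\[
\langle\phi_0,\psi\rangle_{L^2(\mathbb{R}^n)} = \langle V\phi_0, V\psi\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = \langle 1, V\psi\rangle_{\mathcal{H}L^2(\mathbb{C}^n,\mu_\hbar)} = (V\psi)(0).
\]

Thus, the value of $V\psi$ at 0 is just the inner product of $\psi$ with $\phi_0$. More
generally,

\[
\begin{aligned}
\langle e^{-ia\cdot X}e^{-ib\cdot P}\phi_0,\ \psi\rangle
&= \langle\phi_0,\ e^{ib\cdot P}e^{ia\cdot X}\psi\rangle \\
&= \langle V\phi_0,\ Ve^{ib\cdot P}e^{ia\cdot X}\psi\rangle \\
&= \langle 1,\ e^{ib\cdot B}e^{ia\cdot A}V\psi\rangle \\
&= (e^{ib\cdot B}e^{ia\cdot A}V\psi)(0),
\end{aligned}
\tag{14.35}
\]

where $e^{ia\cdot A}$ means the product (in any order) of the operators $e^{ia_jA_j}$, and
similarly for $e^{ib\cdot B}$.

Recall that $A_j$'s and $B_j$'s are defined as the infinitesimal generators
of the groups $U_j$ and $V_j$ in Theorem 14.16, which in turn are defined in
terms of the operators $T_a$. If we use (14.31) to compute the right-hand side
of (14.35), we obtain

\[
\begin{aligned}
(e^{ib\cdot B}e^{ia\cdot A}V\psi)(0)
&= (T_{b/\sqrt{2}}\,T_{ia/\sqrt{2}}\,V\psi)(0) \\
&= e^{i\hbar a\cdot b/2}\,(T_{(b+ia)/\sqrt{2}}\,V\psi)(0) \\
&= e^{i\hbar a\cdot b/2}\,e^{-\hbar(|a|^2+|b|^2)/4}\,(V\psi)(\hbar(b+ia)/\sqrt{2}).
\end{aligned}
\]

Thus, if we apply (14.35) with $a = \sqrt{2}\,y_0/\hbar$ and $b = \sqrt{2}\,x_0/\hbar$, we obtain

\[
\langle e^{-i\sqrt{2}\,y_0\cdot X/\hbar}\,e^{-i\sqrt{2}\,x_0\cdot P/\hbar}\phi_0,\ \psi\rangle
  = e^{ix_0\cdot y_0/\hbar}\,e^{-(|x_0|^2+|y_0|^2)/(2\hbar)}\,(V\psi)(x_0+iy_0).
\tag{14.36}
\]

Solving (14.36) for $(V\psi)(x_0+iy_0)$ gives

\[
(V\psi)(x_0+iy_0) = (\pi\hbar)^{-n/4}\,e^{-ix_0\cdot y_0/\hbar}\,e^{(|x_0|^2+|y_0|^2)/(2\hbar)}
  \int_{\mathbb{R}^n} e^{i\sqrt{2}\,y_0\cdot x/\hbar}\,e^{-|x-\sqrt{2}x_0|^2/(2\hbar)}\,\psi(x)\,dx,
\]

which simplifies to the claimed formula for $V\psi$.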
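
The integral formula for $V$ lends itself to a quick numerical experiment. The sketch below assumes $n = 1$ and uses the kernel $\exp\{-(z^2 - 2\sqrt{2}\,zx + x^2)/(2\hbar)\}$ with prefactor $(\pi\hbar)^{-1/4}$, together with the ground state $\phi_0(x) = (\pi\hbar)^{-1/4}e^{-x^2/(2\hbar)}$. The quadrature (a plain Riemann sum on a truncated interval) and the function names are choices made for this illustration.

```python
import cmath
import math


def ground_state(x, hbar=1.0):
    """phi_0(x) = (pi hbar)^(-1/4) exp(-x^2 / (2 hbar)), n = 1."""
    return (math.pi * hbar) ** -0.25 * math.exp(-x * x / (2.0 * hbar))


def segal_bargmann(psi, z, hbar=1.0, half_width=12.0, steps=2401):
    """Riemann-sum approximation of (V psi)(z) for n = 1."""
    h = 2.0 * half_width / (steps - 1)
    total = 0.0 + 0.0j
    for k in range(steps):
        x = -half_width + k * h
        kernel = cmath.exp(-(z * z - 2.0 * math.sqrt(2.0) * z * x + x * x) / (2.0 * hbar))
        total += kernel * psi(x)
    return (math.pi * hbar) ** -0.25 * total * h
```

With these definitions, `segal_bargmann(ground_state, z)` should be close to 1 for any moderate complex `z`, matching the normalization $V\phi_0 = 1$. A Gaussian integral also shows that the function $x\,\phi_0(x)$ is sent to $z/\sqrt{2}$, so the image is again a polynomial in $z$.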


14.5 Exercises 


1. Show that if operators $A$ and $B$ satisfy the exponentiated commutation
relations of Sect. 14.2, they satisfy the “semi-exponentiated”
commutation relations, that is, the hypotheses of Theorem 12.8.
Hint: For any $a, s \in \mathbb{R}$ and $\psi \in \mathrm{Dom}(A)$, rearrange the expression

\[
\frac{e^{isA}(e^{iaB}\psi) - (e^{iaB}\psi)}{s}
\]

using the exponentiated commutation relations. Then let $s$ tend to
zero and apply Stone's theorem.


2. (a) Suppose $a : \mathbb{R} \to \mathcal{B}(\mathbf{H})$ is a differentiable map, meaning that
\[
\lim_{h\to 0}\frac{a(t+h) - a(t)}{h}
\]
exists in the norm topology of $\mathcal{B}(\mathbf{H})$ for each $t$. Show that if
$da/dt = 0$ for all $t$, then $a$ is constant.


(b) Suppose $a : \mathbb{R} \to \mathcal{B}(\mathbf{H})$ is a differentiable map such that
\[
\frac{da}{dt} = a(t)A
\]
for some fixed $A \in \mathcal{B}(\mathbf{H})$. Show that $a(t) = a(0)e^{tA}$ for all $t$.

3. Show that the operators $A_j := P_j$ and $B_j := -X_j$ on $L^2(\mathbb{R}^n)$ satisfy
the exponentiated commutation relations. Determine the unitary
operator $U : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ (unique up to a constant) such that

\[
Ue^{itA_j}U^{-1} = e^{itX_j}, \qquad Ue^{itB_j}U^{-1} = e^{itP_j}.
\]


4. Verify that the operators $U_{a,b}(t)$ in (14.9) form a strongly continuous
one-parameter unitary group.


5. In this exercise, we develop a discrete version of (the $n = 1$ case of)
the Stone-von Neumann theorem. Let $p$ be a prime number, let $\mathbb{Z}/p$
denote the field of integers modulo $p$, and let $\hbar$ be a nonzero element
of $\mathbb{Z}/p$. Consider the finite-dimensional Hilbert space $L^2(\mathbb{Z}/p)$,
taken with respect to the counting measure on $\mathbb{Z}/p$. Let $U$ denote the
“modulation” operator

\[
(Uf)(n) = e^{2\pi in/p}f(n)
\]
and let $V$ denote the “translation” operator on $L^2(\mathbb{Z}/p)$, given by
\[
(Vf)(n) = f(n+\hbar).
\]

In the case of the modulation operator, note that the expression
$e^{2\pi in/p}$ descends unambiguously from $n \in \mathbb{Z}$ to $n \in \mathbb{Z}/p$.

(a) Verify that $U^p = V^p = I$ and that, for all $l$ and $m$ in $\mathbb{Z}$,

\[
U^lV^m = e^{-2\pi ilm\hbar/p}\,V^mU^l.
\]

(b) Suppose now that $A$ and $B$ are unitary operators on a finite-dimensional
Hilbert space $\mathbf{H}$ satisfying $A^p = B^p = I$ and

\[
A^lB^m = e^{-2\pi ilm\hbar/p}\,B^mA^l.
\]

Suppose also that the only subspaces of $\mathbf{H}$ invariant under both
$A$ and $B$ are $\{0\}$ and $\mathbf{H}$. Show that there is a unitary map $W$
from $\mathbf{H}$ to $L^2(\mathbb{Z}/p)$ such that

\[
WAW^{-1} = U, \qquad WBW^{-1} = V.
\]
Hint: Show that if $v \in \mathbf{H}$ is an eigenvector for $A$, then so is
$B^lv$ for any $l$. Show that each eigenspace for $A$ has dimension 1
and identify the associated eigenvectors with the “$\delta$-functions”
in $L^2(\mathbb{Z}/p)$.
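
Before attempting part (a) by hand, it can be reassuring to see the relations hold for explicit matrices. The sketch below builds the modulation and translation operators as $p \times p$ matrices in the basis of $\delta$-functions and measures how far the relations $U^p = V^p = I$ and $U^lV^m = e^{-2\pi ilmh/p}V^mU^l$ are from holding, where the integer `h` plays the role of $\hbar$. The helper names and the use of plain nested lists are choices made for this illustration; it is a numerical check for particular $p$ and $h$, not a proof.

```python
import cmath


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matpow(A, k):
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result


def modulation(p):
    # (U f)(n) = exp(2 pi i n / p) f(n): a diagonal matrix.
    return [[cmath.exp(2j * cmath.pi * n / p) if n == m else 0.0 for m in range(p)]
            for n in range(p)]


def translation(p, h):
    # (V f)(n) = f(n + h): row n has a 1 in column (n + h) mod p.
    return [[1.0 if m == (n + h) % p else 0.0 for m in range(p)] for n in range(p)]


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(len(A)) for j in range(len(A)))


def check_relations(p, h):
    """Largest entrywise violation of the relations in part (a)."""
    U, V = modulation(p), translation(p, h)
    identity = matpow(U, 0)
    worst = max(max_abs_diff(matpow(U, p), identity), max_abs_diff(matpow(V, p), identity))
    for l in range(p):
        for m in range(p):
            lhs = matmul(matpow(U, l), matpow(V, m))
            phase = cmath.exp(-2j * cmath.pi * l * m * h / p)
            rhs = [[phase * entry for entry in row]
                   for row in matmul(matpow(V, m), matpow(U, l))]
            worst = max(worst, max_abs_diff(lhs, rhs))
    return worst
```

A call such as `check_relations(5, 2)` should return a number at the level of floating-point rounding.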


6. Given a constant $u \in \mathbb{C}$ with $|u| = 1$ and a pair of vectors $a,b \in \mathbb{R}^n$,
let $U_{u,a,b}$ be the unitary operator on $L^2(\mathbb{R}^n)$ given by

\[
(U_{u,a,b}\psi)(x) = u\,e^{ia\cdot x}\,\psi(x+\hbar b).
\]

(a) Verify that the set of operators of this form is a group under the
operation of composition, and denote this group by $\tilde{H}_n$.

(b) Let $H_n$ denote the set of $(n+2)\times(n+2)$ matrices of the form

\[
A = \begin{pmatrix}
1 & a_1 & \cdots & a_n & c \\
  & 1   &        &     & b_1 \\
  &     & \ddots &     & \vdots \\
  &     &        & 1   & b_n \\
  &     &        &     & 1
\end{pmatrix}
\]
with $a_1,\ldots,a_n$ and $b_1,\ldots,b_n$ in $\mathbb{R}$. (The only nonzero entries
in $A$ are on the main diagonal, in the first row, and in the last
column.) Verify that $H_n$ forms a group under matrix multiplication.
Show that there is a surjective group homomorphism
$\Phi : H_n \to \tilde{H}_n$ with discrete kernel.

Hint: Compare the formulas for group multiplication in $H_n$
and $\tilde{H}_n$.


Note: In the language of Chap. 16, $H_n$ is the universal covering group
of $\tilde{H}_n$. The group $H_n$ is called the Heisenberg group.

7. Show by direct computation that the operators $T_a$ in (14.30) satisfy
the relations (14.31).


8. Using dominated convergence, show that for every holomorphic polynomial
$F$ on $\mathbb{C}^n$, we have

\[
\lim_{a\to a_0}\|T_aF - T_{a_0}F\|_{L^2(\mathbb{C}^n,\mu_\hbar)} = 0,
\]

where $T_a$ is as in (14.30).


15 
The WKB Approximation 


15.1 Introduction 


The WKB method, named for Gregor Wentzel, Hendrik Kramers, and Léon
Brillouin, gives an approximation to the eigenfunctions and eigenvalues of
the Hamiltonian operator $\hat{H}$ in one dimension. The approximation is best
understood as applying to a fixed range of energies as $\hbar$ tends to zero. (It
is also reasonable in many cases to think of the approximation as applying
to a fixed value of $\hbar$ as the energy tends to infinity.)

The idea of the WKB approximation is that the potential function $V(x)$
can be thought of as being “slowly varying,” with the result that solutions
to the time-independent Schrödinger equation will look locally like the
solutions in the case of a constant potential. In the classically allowed region,
this line of thinking will yield an approximation consisting of a rapidly
oscillating complex exponential multiplied by a slowly varying amplitude. We
make the “local frequency” of the exponential equal to what it would be if
$V$ were constant. Having made this choice, there is a unique choice for the
amplitude that yields an error that is of order $\hbar^2$. This amplitude, however,
tends to infinity as we approach the “turning points,” that is, the points
where the classical particle changes directions. Similarly, in the classically
forbidden region, we obtain approximate solutions that are rapidly growing
or decaying exponentials, multiplied by a slowly varying factor. Again,
there is a unique choice for the slowly varying factor that gives errors of
order $\hbar^2$, and again, this factor blows up at the turning points.

The difficulty near the turning points means that we cannot directly
“match” the approximate solutions in different regimes the way we did in
Chap. 5. Instead, we will use the Airy function to approximate the solution
to the Schrödinger equation near the turning points. Asymptotics of the
Airy function will then yield the appropriate matching condition, which
turns out to be a corrected form of the Bohr-Sommerfeld rule that appears
in the “old” quantum theory.


15.2 The Old Quantum Theory and the Bohr-Sommerfeld Condition


The old quantum theory, developed by Bohr, Sommerfeld, and de Broglie,
among others, may be pictured as follows. Consider, for simplicity, a particle
with one degree of freedom, and let $C$ be a level set in phase space of
the Hamiltonian,

\[
C = \{(x,p) \in \mathbb{R}^2 : H(x,p) = E\},
\tag{15.1}
\]

which we assume to be a closed curve. We now imagine drawing a “wave”
on $C$, that is, some oscillatory function defined over $C$. Following the de
Broglie hypothesis (Sect. 1.2.2), we postulate that the local frequency $k$ of
the wave as a function of $x$ is $p/\hbar$. This means that the phase of our wave
should be obtained by integrating the 1-form

\[
\frac{1}{\hbar}\,p\,dx
\tag{15.2}
\]

along the curve. Thus, the wave itself can be pictured as a function on $C$
of the form

\[
\cos\left(\frac{1}{\hbar}\int_{x_0}^{x} p\,dx - \delta\right),
\tag{15.3}
\]

where $x_0$ is some arbitrary starting point on the curve $C$ and where $\delta$ is an
arbitrary phase. Note that the old quantum theory did not offer a physical
interpretation of this wave; it was simply a crude attempt to introduce
waves into the picture.

The Bohr-Sommerfeld condition is simply the requirement that the function
in (15.3) should match up with itself when we go all the way around
the curve. This will happen precisely if

\[
\frac{1}{\hbar}\oint_C p\,dx = 2\pi n
\tag{15.4}
\]

for some integer $n$. The energy levels in the old quantum theory were taken
to be those numbers $E$ for which the corresponding level curve $C$ satisfies
the Bohr-Sommerfeld condition (15.4). Although Bohr-Sommerfeld
quantization had some successes, notably explaining the energy levels of
the hydrogen atom, it ultimately failed to correctly predict the energies of
complex systems.

For systems with one degree of freedom, a vestige of the Bohr-Sommerfeld
approach survives in modern quantum theory, with two modifications.
First, the condition (15.4) has to be corrected by replacing the $n$ by $n+1/2$
on the right-hand side of (15.4). (The replacement of $n$ by $n+1/2$ is known
as the Maslov correction.) Second, this condition does not (in most cases)
give the exact energy levels, but only the leading-order semiclassical
approximation to the energy levels. The preceding discussion leads to the
following definition.


Condition 15.1 A number $E$ is said to satisfy the Maslov-corrected
Bohr-Sommerfeld condition if

\[
\frac{1}{\hbar}\oint_C p\,dx = 2\pi(n+1/2)
\tag{15.5}
\]

for some integer $n$, where $C$ is the classical energy curve in (15.1). In light
of Green's theorem, this condition may be rewritten as

\[
\frac{1}{2\pi\hbar}\,(\text{Area enclosed by } C) = n + \frac{1}{2}.
\]
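
For a concrete feel for Condition 15.1, one can evaluate the loop integral of $p\,dx$ numerically. The sketch below is an illustration under stated assumptions: it takes a potential with a single well and known turning points, and the test case is the harmonic oscillator $V(x) = m\omega^2x^2/2$, a choice made here for illustration because its exact energy levels $\hbar\omega(n+1/2)$ are known. The function name `action`, the substitution $x = x_{\mathrm{mid}} + r\sin\theta$, and the midpoint rule are likewise choices of this sketch.

```python
import math


def action(E, V, x_left, x_right, m=1.0, steps=2000):
    """Approximate the loop integral of p dx around the curve H(x, p) = E.

    Assumes a single well with turning points x_left < x_right. The
    substitution x = mid + r sin(theta) removes the square-root endpoint
    behavior of p(x) = sqrt(2m(E - V(x))) from the integrand.
    """
    mid = 0.5 * (x_left + x_right)
    r = 0.5 * (x_right - x_left)
    dtheta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -0.5 * math.pi + (k + 0.5) * dtheta
        x = mid + r * math.sin(theta)
        p = math.sqrt(max(0.0, 2.0 * m * (E - V(x))))
        total += p * r * math.cos(theta) * dtheta
    # The closed curve has two branches, p = +sqrt(...) and p = -sqrt(...).
    return 2.0 * total


def maslov_quantum_number(E, V, x_left, x_right, hbar=1.0, m=1.0):
    """Return action / (2 pi hbar) - 1/2, an integer when (15.5) holds."""
    return action(E, V, x_left, x_right, m) / (2.0 * math.pi * hbar) - 0.5
```

For the harmonic oscillator with $E = \hbar\omega(n+1/2)$ and turning points $\pm\sqrt{2E/(m\omega^2)}$, `maslov_quantum_number` should return a value very close to the integer $n$. This is one of the cases in which the corrected condition reproduces the exact levels rather than only a leading-order approximation.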


When the Maslov correction is included, the Bohr-Sommerfeld condition 
can be stated as saying that the wave with phase given by integrating the 
1-form in (15.2) should be 180° out of phase with itself after one trip around 
the energy curve. Figure 15.1 shows an example, which should be contrasted 
with Fig. 1.3. (Note also that Fig. 1.3 is drawn in the configuration space, 
whereas Fig. 15.1 is in the phase space.) 

In our analysis in the subsequent sections, we will see that the Maslov 
correction—that is, the extra 1/2 in (15.5), as compared to (15.4)—actually 
consists of a contribution of 1/4 from each of the two “turning points” of 
the classical particle. (The turning points are the points where the classical 
particle changes directions.) Specifically, in the WKB approximation, the 
phase of the wave function will be computed as the integral of (p dx)/h 
along one “branch” of the classical energy curve C. Using the Airy function 
to approximate the wave function near the turning points, we will obtain 
an “extra” 2/4 of phase between each turning point and the last local 
maximum or minimum of the wave function. Because of the two branches 
of C, the extra 7/4 of phase near each of the two turning points actually 
contributes an extra 7 to the integral on the left-hand side of (15.5). 

The reader may wonder why there is no comparable correction term 
in our discussion of the Bohr-de Broglie model of the hydrogen atom in 
Sect. 1.2.2. One way to answer this question is as follows. As we will see in 
Sect. 18.1, the Schrédinger operator for the hydrogen atom can be reduced 
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P 








FIGURE 15.1. A trajectory satisfying the corrected Bohr-Sommerfeld condition 
with n = 10. 


to a one-dimensional Schrodinger operator with an effective potential of the 
form 


2 2 
Veatr) = 2 _Wul+d) 


r 2mr2 





Here l is a non-negative integer that labels the “total angular momentum”
of the wave function. At least when l > 0, one can analyze this Schrödinger
operator using a WKB-type analysis very similar to the one in the current
chapter, with one important modification: The radial wave function [the
quantity h(r) in (18.5)] must be zero at r = 0 in order for the wave function
to be in the domain of the Hamiltonian.

If one analyzes the situation carefully, it turns out that the zero boundary
condition at r = 0 introduces another correction into the Bohr-Sommerfeld
condition in the amount of 1/2. There is still also a correction of 1/4 for
each of the two turning points, leading to the condition

\[ \frac{1}{\hbar}\oint p\,dx = 2\pi\left( n + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} \right) = 2\pi(n+1). \]

Since n + 1 is again an integer, we are effectively back to the uncorrected
Bohr-Sommerfeld condition. See Chap. 11 of [8] for a discussion of different
approaches to the WKB approximation for radial potentials.


15.3 Classical and Semiclassical Approximations 


We are interested in finding approximate solutions to the time-independent
Schrödinger equation,

\[ -\frac{\hbar^2}{2m}\psi''(x) + (V(x) - E)\psi(x) = 0 \tag{15.6} \]

for small values of ħ. Ultimately, we will need to analyze the behavior of
solutions in three different regions: the classically allowed region [points
where V(x) < E], the classically forbidden region [points where V(x) > E],
and the region near the “turning points,” that is, the points where
V(x) = E.

Let us consider first the classically allowed region. Given a potential
V and an energy level E, we can solve (up to a choice of sign) for the
momentum of a classical particle as a function of position as

\[ p(x) = \sqrt{2m(E - V(x))}. \]

We look for approximate solutions ψ to (15.6) of the form

\[ \psi(x) = A(x)\,e^{\pm iS(x)/\hbar}, \tag{15.7} \]

where S satisfies S′(x) = p(x). Note that we are taking the phase of our
wave function to be

\[ \text{phase} = \frac{1}{\hbar}\int p(x)\,dx, \]

as in the old quantum theory in Sect. 15.2. The “amplitude function” A(x)
will be chosen to be independent of ħ and thus “slowly varying” (for small ħ)
compared to the exponent S(x)/ħ.

Our first, elementary, result is that for any number E for which there is
a classically allowed region and for any reasonable choice of the amplitude
A(x) in (15.7), we obtain an approximate eigenvector solution to the
time-independent Schrödinger equation, with an error term of order ħ.


Proposition 15.2 For any two numbers E_1 and E_2 with E_1 > inf_{x∈R} V(x),
there exists a constant C and a nonzero function A ∈ C_c^∞(R) with the
following property. For every E ∈ [E_1, E_2], the support of A is contained
in the classically allowed region at energy E and the function ψ given by

\[ \psi(x) = A(x)\exp\left\{ \pm\frac{i}{\hbar}\int p(x)\,dx \right\} \]

satisfies

\[ \|H\psi - E\psi\| \le C\hbar\,\|\psi\|. \tag{15.8} \]


Proof. For any E ∈ [E_1, E_2], the classically allowed region for energy E
contains the classically allowed region for energy E_1. We choose, then, A to
be any nonzero element of C_c^∞(R) with support in the classically allowed
region for energy E_1. If we evaluate Hψ − Eψ by direct calculation, there
will be a term in which two derivatives fall on the exponential factor, bringing
down a factor involving p(x)^2. The definition of p(x) is such that the term
involving p(x)^2 will cancel the term involving V(x) − E, leaving us with

\[ H\psi - E\psi = -\frac{\hbar^2}{2m}\left( A''(x) \pm \frac{2i}{\hbar}A'(x)p(x) \pm \frac{i}{\hbar}p'(x)A(x) \right)\exp\left\{ \pm\frac{i}{\hbar}\int p(x)\,dx \right\}. \tag{15.9} \]

(Here, each occurrence of the symbol ± has the same value, either all pluses
or all minuses.) Thus,

\[ \|H\psi - E\psi\| \le \frac{\hbar^2}{2m}\|A''\| + \frac{\hbar}{2m}\|2A'p + Ap'\|. \tag{15.10} \]

Since ‖ψ‖ is independent of ħ, the right-hand side of (15.10) is of order
ħ‖ψ‖. It is easy to check that ‖2A′p + Ap′‖ is bounded as a function of E
for any E in the range [E_1, E_2] and the result follows. ∎
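
The "direct calculation" invoked in the proof can be written out as follows. This is only a sketch of the intermediate steps, using the notation of (15.7) with S′ = p and p(x)^2 = 2m(E − V(x)); it adds no assumptions beyond those of the proposition.

```latex
% psi = A e^{\pm iS/\hbar}, with S' = p.
\begin{align*}
\psi'  &= \Big( A' \pm \tfrac{i}{\hbar}\,pA \Big) e^{\pm iS/\hbar}, \\
\psi'' &= \Big( A'' \pm \tfrac{i}{\hbar}\,(2A'p + p'A) - \tfrac{p^2}{\hbar^2}\,A \Big) e^{\pm iS/\hbar}, \\
-\tfrac{\hbar^2}{2m}\,\psi''
       &= \Big( -\tfrac{\hbar^2}{2m}A'' \mp \tfrac{i\hbar}{2m}(2A'p + p'A) + \tfrac{p^2}{2m}A \Big) e^{\pm iS/\hbar}.
\end{align*}
% Since p^2/(2m) = E - V, the last term equals -(V - E)\psi and cancels
% the potential term in H\psi - E\psi, which leaves exactly (15.9).
```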

Proposition 15.2, along with elementary spectral theory, tells us that for
any E larger than the minimum of V, there is a point Ẽ in the spectrum
of H such that

\[ |E - \tilde{E}| \le C\hbar. \tag{15.11} \]

(See Exercise 4 in Chap. 10.) If we assume that V(x) tends to +∞ as
x → ±∞, then H will have discrete spectrum and we can say that Ẽ is
an eigenvalue for H. The conclusion, for such potentials, is this: Given any
number E ∈ [E_1, E_2], there is an eigenvalue of H within Cħ of E. Thus, as
ħ tends to zero, the eigenvalues of H “fill up” the entire range of values of
the classical energy function.

Proposition 15.2 is one manifestation of the “classical limit” of quantum
mechanics: the quantum energy spectrum is, in a certain sense, approximating
the classical energy spectrum as ħ gets small. Notice, however, that
this result tells us only that the eigenvalues are at most order ħ apart and
nothing further about the location of the individual eigenvalues.

In this chapter, we will show that if E satisfies the corrected
Bohr-Sommerfeld condition, then there exists an eigenvalue Ẽ of H such that

\[ |E - \tilde{E}| \le C\hbar^{9/8}. \tag{15.12} \]

An estimate of the form (15.12) locates eigenvalues with an error bound
that is small compared to the expected average spacing between the
eigenvalues, which is of order ħ. On the other hand, the approximate energy
levels E are determined by Condition 15.1, which is a condition on the
classical energy curve. Thus, (15.12) can be described as a semiclassical
estimate: It is estimating quantum mechanical quantities (the individual
energy levels) in classical terms (the level curves of the classical
Hamiltonian).


15.4 The WKB Approximation Away from the Turning Points


We consider only the simplest interesting case of the WKB approximation,
in which the following assumption holds. See the book of Miller [30] for
much more about this sort of asymptotic analysis.


Assumption 15.3 Consider a smooth, real-valued potential V(x), with
V(x) → +∞ as x → ±∞. Assume that the functions V′(x)/V(x) and
V″(x)/V(x) are bounded for x near ±∞.

Consider also a range of energies of the form E_1 ≤ E ≤ E_2. Assume
that for each E in this range, there are exactly two points, a(E) and b(E),
with a(E) < b(E), for which V(x) = E. Further assume that the derivative
of V is nonzero at a(E) and b(E), for all E ∈ [E_1, E_2].

See Fig. 15.2 for a typical example. Since V is locally bounded and tends
to +∞ at infinity, H is essentially self-adjoint on C_c^∞(R) (Theorem 9.39)
and has purely discrete spectrum (Theorem XIII.16 in Volume IV of [34]).
The assumption that V′/V and V″/V be bounded near infinity is stronger
than necessary, but still applies to most of the interesting cases.

We refer to a(E) and b(E) as the turning points, since these are the
points where a classical particle with energy E changes direction. When
the energy E is understood as being fixed, we will write the turning points
simply as a and b.


15.4.1 The Classically Allowed Region 


As in Sect. 15.3, we seek approximate solutions to the time-independent
Schrödinger equation having the following form in the classically allowed
region:

\[ \psi(x) = A(x)\exp\left\{ \pm\frac{i}{\hbar}\int p(x)\,dx \right\}, \tag{15.13} \]

where p(x) = √(2m(E − V(x))) is the momentum of a classical particle with
energy E and position x. According to (15.9), this form for ψ gives

\[ H\psi - E\psi = -\frac{\hbar^2}{2m}\left( A''(x) \pm \frac{2i}{\hbar}A'(x)p(x) \pm \frac{i}{\hbar}p'(x)A(x) \right) \times \exp\left\{ \pm\frac{i}{\hbar}\int p(x)\,dx \right\}. \tag{15.14} \]

Since we want to obtain an approximate solution with an error smaller
than ħ, we require that the second and third terms in parentheses in (15.14)
cancel. This cancellation will occur if A satisfies

\[ 2A'(x)p(x) = -p'(x)A(x) \]

or

\[ \frac{A'(x)}{A(x)} = -\frac{1}{2}\,\frac{p'(x)}{p(x)}, \tag{15.15} \]

which we can easily solve (Exercise 3) as

\[ A(x) = C\,(p(x))^{-1/2}. \tag{15.16} \]

If A is given by (15.16), we will have

\[ H\psi - E\psi = -\frac{\hbar^2}{2m}\,\frac{A''(x)}{A(x)}\,\psi(x), \tag{15.17} \]

indicating that our error is of order ħ^2. This expression, however, is only
local, in that it applies only in the classically allowed region. Furthermore,
p(x) tends to zero at the turning points, which means that A(x) becomes
unbounded at these points. This blow-up of the amplitude is a substantial
complicating factor in the analysis.

FIGURE 15.2. A potential satisfying Assumption 15.3.

We can get an approximate solution to the Schrödinger equation by taking
a linear combination of the functions in (15.13) with the two different choices
for the sign in the exponent, with constants c_1 and c_2. It is convenient to
take the basepoint of our integration to be the left-hand turning point
a = a(E). Furthermore, since the Schrödinger operator H commutes with
complex conjugation, the real and imaginary parts of any solution to the
time-independent Schrödinger equation are again solutions. We will therefore
consider only real-valued approximate solutions, i.e., those in which
c_2 = \overline{c_1}. Using Exercise 1, we can then write our approximate
solution as follows.

Summary 15.4 Suppose ψ is a real-valued solution to the time-independent
Schrödinger equation. Then in the classically allowed region but away from
the turning points, we expect that ψ is well approximated by an expression
of the form

\[ \psi(x) \approx \frac{R}{\sqrt{p(x)}}\cos\left\{ \frac{1}{\hbar}\int_a^x p(y)\,dy - \delta \right\}, \tag{15.18} \]

where p(x) = √(2m(E − V(x))) is the momentum of a classical particle with
energy E and position x. Here R and δ are real constants, referred to as
the amplitude and the phase of the approximate solution.

We refer to the function in (15.18) as the oscillatory WKB function. In
integrating the square of the oscillatory WKB function over some interval,
we may apply the identity cos^2 θ = (1 + cos(2θ))/2 to the cosine factor.
The rapidly oscillating cos(2θ) term will be small for small ħ because of
cancellation between positive and negative values. Thus, the integral of
ψ^2(x) over an interval will be, to leading order, just a constant times the
integral of 1/p(x), or, equivalently, a constant times the integral of 1/v(x),
where v is the velocity of the classical particle. But the integral of
1/v(x) = dt/dx with respect to x is just the time t that the classical
particle spends in the interval. We obtain, then, the following result.


Conclusion 15.5 If the amplitude R in (15.18) is chosen so that ψ has
L^2 norm 1 over [a, b], then the probability of finding the quantum particle in
an interval [c, d] ⊂ [a, b] is approximately the fraction of time the classical
particle spends in [c, d] over one period of classical motion.
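
Conclusion 15.5 can be illustrated numerically. The sketch below is not part of the text's argument: it assumes the harmonic oscillator V(x) = x^2/2 in units with m = ω = ħ = 1 (so that the exact eigenfunctions are available through the standard Hermite-function recurrence), and the function names are hypothetical. It compares the exact quantum probability of an interval with the fraction of time a classical particle of energy n + 1/2 spends there.

```python
import math


def oscillator_state(n, x):
    """n-th normalized harmonic-oscillator eigenfunction at x (m = omega = hbar = 1)."""
    previous = 0.0
    current = math.pi ** -0.25 * math.exp(-0.5 * x * x)
    for k in range(n):
        # Three-term recurrence for the normalized Hermite functions.
        following = (math.sqrt(2.0 / (k + 1)) * x * current
                     - math.sqrt(k / (k + 1)) * previous)
        previous, current = current, following
    return current


def quantum_probability(n, c, d, steps=4000):
    """Composite Simpson approximation of the integral of psi_n^2 over [c, d]."""
    h = (d - c) / steps
    total = oscillator_state(n, c) ** 2 + oscillator_state(n, d) ** 2
    for j in range(1, steps):
        weight = 4 if j % 2 else 2
        total += weight * oscillator_state(n, c + j * h) ** 2
    return total * h / 3


def classical_fraction(n, c, d):
    """Fraction of a classical period spent in [c, d] at energy n + 1/2.

    For x(t) = amplitude * sin(t), the time spent in [c, d] during half a
    period (length pi) is asin(d / amplitude) - asin(c / amplitude).
    Requires [c, d] to lie between the turning points.
    """
    amplitude = math.sqrt(2 * (n + 0.5))
    return (math.asin(d / amplitude) - math.asin(c / amplitude)) / math.pi


if __name__ == "__main__":
    n, c, d = 39, 0.0, 4.0
    print("quantum  :", quantum_probability(n, c, d))
    print("classical:", classical_fraction(n, c, d))
```

For large n the two numbers should be close, with a discrepancy coming from the oscillating cos(2θ) term discussed before the conclusion.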


15.4.2 The Classically Forbidden Region

In the classically forbidden region, let us introduce the quantity

\[ q(x) := \sqrt{2m(V(x) - E)}. \]

We look for approximate solutions to the Schrödinger equation (15.6) of
the form

\[ \psi(x) = A(x)\exp\left\{ \pm\frac{1}{\hbar}\int q(x)\,dx \right\}. \]

If we analyze approximate solutions of this form precisely as in the
classically allowed region, we again find that there is a unique choice for A (up
to multiplication by a constant) that causes the order-ħ terms in Hψ − Eψ
to cancel, namely A(x) = C(q(x))^{-1/2}. If we are hoping to approximate a
square-integrable solution of the Schrödinger equation, we want to take a
minus sign in the exponent on the interval (b, ∞), and it is convenient to
take the basepoint of our integration to be b. In the region (−∞, a), we want
to take a plus sign in the exponent; it is then convenient to take the basepoint
of our integration to be a and to reverse the direction of integration, which
changes the sign in the exponent back to being negative.

FIGURE 15.3. The WKB functions, extended all the way to the turning points.

Summary 15.6 If ψ_1(x) is a solution to the time-independent Schrödinger
equation that tends to zero as x approaches −∞, we expect that ψ_1 will be
well approximated on (−∞, a), but away from the turning point, by the
expression

\[ \frac{c_1}{\sqrt{q(x)}}\exp\left\{ -\frac{1}{\hbar}\int_x^a q(y)\,dy \right\}, \tag{15.19} \]

where q(x) = √(2m(V(x) − E)). Meanwhile, if ψ_2(x) is a solution to the
time-independent Schrödinger equation that tends to zero as x approaches
+∞, we expect that ψ_2 will be well approximated on (b, +∞), but away from
the turning point, by the expression

\[ \frac{c_2}{\sqrt{q(x)}}\exp\left\{ -\frac{1}{\hbar}\int_b^x q(y)\,dy \right\}. \tag{15.20} \]

We refer to the functions in (15.19) and (15.20) as the exponential WKB
functions. The general theory of ordinary differential equations tells us that
any solution to the time-independent Schrödinger equation for a smooth
potential is smooth. Thus, the singularity at the turning points is an artifact
of our approximation method. Nevertheless, for small values of ħ, the true
solution will “track” the WKB approximation until x gets very close to
the turning point, with the result that the true solution will be large, but
finite, near the turning points.

Figure 15.3 plots a potential function V(x), an energy level E, and the
WKB functions in both the classically allowed and classically forbidden
regions. In the figure, the WKB functions have been (improperly) used all
the way up to the turning points.


15.5 The Airy Function and the Connection Formulas


For any constant c_1 and any energy level E, we expect that there is a unique
solution ψ_1 of the Schrödinger equation (15.6) that is well approximated
for x tending to −∞ by a function of the form (15.19). We expect that this
solution will be well approximated in the classically allowed region (but
not too close to the turning points) by a function of the form (15.18) for
a unique pair of constants R and δ. In this section, we will see that the
correct choices for R and δ are

\[ R = 2c_1, \qquad \delta = \frac{\pi}{4}. \tag{15.21} \]

The formula (15.21) for R and δ is called a connection formula; there is a
similar formula connecting an approximate solution that tends to zero as x
tends to +∞ to an approximate solution in the classically allowed region.
By comparing the two connection formulas, we will obtain conditions on
the energy E under which the two approximate solutions (one that decays
near −∞ and one that decays near +∞) agree up to a constant in the
classically allowed region. The condition on E will turn out to be precisely
Condition 15.1.

The discussion in the previous paragraph should be compared to the
analysis in Chap. 5, where we determined the constants for the solution
inside the well in terms of the energy level and the constant in front of
the exponentially decaying solution outside the well. Here, of course, the
analysis is more complicated because neither of the approximations (15.19)
or (15.18) is valid near the turning point. The connection formula will be
obtained, then, by using the Airy equation to approximate the Schrödinger
equation near the turning points.

To get a reasonable approximation of our wave function near the turning
points, we approximate V locally by a linear function. (By contrast, in the
WKB functions, we are essentially thinking of V as being locally constant.)
Thus, for example, near the turning point a, where V(a) = E, we write
V(x) ≈ E + (a − x)F_0, where F_0 = −V′(a), yielding the approximate equation

\[ -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + (a - x)F_0\,\psi = 0. \]

By making the change of variable

\[ u = \left( \frac{2mF_0}{\hbar^2} \right)^{1/3}(a - x), \tag{15.22} \]

we can reduce the equation to

\[ \frac{d^2\psi}{du^2} - u\,\psi(u) = 0, \tag{15.23} \]

which is the Airy equation.

Equation (15.23) has two linearly independent solutions, denoted Ai(u)
and Bi(u). We are interested in the solution Ai(u), since this is the one
that decays for u > 0, that is, for x < a. The function Ai(u) is defined by
the following convergent improper integral:

\[ \mathrm{Ai}(u) = \frac{1}{\pi}\int_0^\infty \cos\left( \frac{t^3}{3} + ut \right) dt. \tag{15.24} \]

Intuitively, convergence is due to the very rapid oscillation of the integrand
for large t, which produces a cancellation between the positive and negative
values of the cosine function. Rigorously, convergence can be proved
using integration by parts, as in Exercise 6. By differentiating under the
integral sign (Exercise 7), one can show that Ai indeed satisfies the Airy
equation (15.23).
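
For reference, the passage from the linearized Schrödinger equation to the Airy equation (15.23) under the change of variable (15.22) can be checked in a few lines. The abbreviation k below is introduced only for this sketch and is not notation from the text.

```latex
% Write u = k(a - x) with k = (2 m F_0 / \hbar^2)^{1/3}.
\begin{align*}
\frac{d^2\psi}{dx^2} &= k^2\,\frac{d^2\psi}{du^2}, \qquad (a - x)F_0 = \frac{F_0}{k}\,u, \\
0 &= -\frac{\hbar^2 k^2}{2m}\,\frac{d^2\psi}{du^2} + \frac{F_0}{k}\,u\,\psi
  \;\Longleftrightarrow\;
  \frac{d^2\psi}{du^2} - \frac{2mF_0}{\hbar^2 k^3}\,u\,\psi = 0.
\end{align*}
% Since k^3 = 2 m F_0 / \hbar^2, the coefficient of u\psi is 1, which is (15.23).
```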

As |u| gets large, the integrand in (15.24) becomes more and more rapidly
oscillating, producing more cancellation. The only exception to this behavior
is when the derivative (with respect to t) of the function t^3/3 + ut is zero.
Near such a point, the argument of the cosine function is changing slowly
and there is little oscillation. If u is negative, there is a unique critical point
of t^3/3 + ut, at t = √(−u), and we expect that the main contribution to the
integral in (15.24) will come from t ≈ √(−u). If u is positive, t^3/3 + ut has no
critical points, and we expect that the integral in (15.24) will become quite
small as u tends to +∞. This sort of reasoning can be used to determine
the precise asymptotics of the Airy function as u tends to +∞ and as u
tends to −∞; see the discussion following (15.32) and (15.33).

We now state our main result, which will be derived in the remainder of
this section. The result is not rigorous, because we have not estimated any
of the errors involved; such error estimates will be performed in Sect. 15.6.


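Claim 15.7 If ψ_1 is a solution of the Schrödinger equation (15.6) that
tends to zero near −∞, then ψ_1 can be normalized so that the following
approximations hold:

\[ \psi_1(x) \approx \frac{1}{2\sqrt{q(x)}}\exp\left\{ -\frac{1}{\hbar}\int_x^a q(y)\,dy \right\} \quad (\text{near } -\infty) \tag{15.25} \]

\[ \psi_1(x) \approx \frac{\sqrt{\pi}}{(2mF_0\hbar)^{1/6}}\,\mathrm{Ai}\left( \left( \frac{2mF_0}{\hbar^2} \right)^{1/3}(a - x) \right) \quad (\text{near } x = a) \tag{15.26} \]

\[ \psi_1(x) \approx \frac{1}{\sqrt{p(x)}}\cos\left\{ \frac{1}{\hbar}\int_a^x p(y)\,dy - \frac{\pi}{4} \right\} \quad (a < x < b). \tag{15.27} \]

Here F_0 = −V′(a) and in the case of (15.27), x should not be too close to
a or to b.

Similarly, if ψ_2 is a solution of the Schrödinger equation (15.6) that
tends to zero near +∞, then ψ_2 can be normalized so that the following
approximations hold:

\[ \psi_2(x) \approx \frac{1}{\sqrt{p(x)}}\cos\left\{ -\frac{1}{\hbar}\int_x^b p(y)\,dy + \frac{\pi}{4} \right\} \quad (a < x < b) \tag{15.28} \]

\[ \psi_2(x) \approx \frac{\sqrt{\pi}}{(2mF_1\hbar)^{1/6}}\,\mathrm{Ai}\left( \left( \frac{2mF_1}{\hbar^2} \right)^{1/3}(x - b) \right) \quad (\text{near } x = b) \tag{15.29} \]

\[ \psi_2(x) \approx \frac{1}{2\sqrt{q(x)}}\exp\left\{ -\frac{1}{\hbar}\int_b^x q(y)\,dy \right\} \quad (\text{near } +\infty). \tag{15.30} \]

Here F_1 = V′(b) and in the case of (15.28), x should not be too close to a
or to b.

The approximate formulas for ψ_1 and ψ_2 will agree, up to multiplication
by a constant, in the classically allowed region if and only if we have

\[ \frac{1}{\hbar}\int_a^b p(x)\,dx = \left( n + \frac{1}{2} \right)\pi \tag{15.31} \]

for some non-negative integer n.
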
More specifically, (15.27) and (15.28) are equal when the integer n in
(15.31) is even and they are negatives of each other when n is odd. Note
that there is a factor of 2 in the denominator in (15.25) but not in (15.27);
this factor accounts for the expression R = 2c_1 in (15.21).

Since the classical energy curve consists of two “branches,” of the form
(x, p(x)) and (x, −p(x)), the compatibility condition (15.31) is equivalent
to Condition 15.1. Since the phase of the approximate wave function in
the classically allowed region is given by 1/ħ times the integral of p dx,
the condition (15.31) says that the wave function goes through a little
more than n half-cycles between the two turning points, where a half-cycle
corresponds to a change in the phase in the amount of π, or the interval
between two critical points of the wave function. In particular, the wave
function has exactly n + 1 critical points inside the classically allowed region.
The first and last critical points occur slightly inside the turning points,
leaving a change in phase of roughly π/4 between the extreme critical point
and the turning point.
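
The compatibility condition (15.31) can be turned into a small numerical procedure for approximate energy levels. The following is a minimal sketch under stated assumptions, not the method of the text: the function names, the bracketing arguments, and the quadrature (a midpoint rule after the substitution x = c + h sin θ, which tames the square-root behavior of p at the turning points) are all choices made here for illustration. It assumes a single-well potential as in Assumption 15.3, with V − E changing sign exactly once on each of the brackets supplied.

```python
import math


def turning_point(V, E, lo, hi, iters=200):
    """Bisect for a root of V(x) - E on [lo, hi]; assumes one sign change."""
    f_lo = V(lo) - E
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = V(mid) - E
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def action(V, E, m, x_min, x_left, x_right, n_nodes=2000):
    """Integral of p(x) = sqrt(2m(E - V(x))) between the turning points."""
    a = turning_point(V, E, x_left, x_min)
    b = turning_point(V, E, x_min, x_right)
    center, half = 0.5 * (a + b), 0.5 * (b - a)
    total = 0.0
    for k in range(n_nodes):
        theta = -0.5 * math.pi + (k + 0.5) * math.pi / n_nodes
        x = center + half * math.sin(theta)
        p = math.sqrt(max(0.0, 2.0 * m * (E - V(x))))
        total += p * half * math.cos(theta)
    return total * math.pi / n_nodes


def wkb_energy(V, n, m, hbar, x_min, x_left, x_right, E_lo, E_hi, iters=60):
    """Bisect on E for (1/hbar) * action(E) = (n + 1/2) * pi, as in (15.31).

    Assumes V(x_min) < E_lo < E_hi < min(V(x_left), V(x_right)).
    """
    target = (n + 0.5) * math.pi * hbar
    for _ in range(iters):
        E_mid = 0.5 * (E_lo + E_hi)
        if action(V, E_mid, m, x_min, x_left, x_right) < target:
            E_lo = E_mid
        else:
            E_hi = E_mid
    return 0.5 * (E_lo + E_hi)


if __name__ == "__main__":
    def harmonic(x):
        return 0.5 * x * x

    for n in range(4):
        print(n, wkb_energy(harmonic, n, 1.0, 1.0, 0.0, -10.0, 10.0, 0.01, 20.0))
```

For the harmonic oscillator with m = ω = ħ = 1 the integral of p between the turning points is πE, so the condition reproduces the familiar levels n + 1/2; for other potentials the output is only the semiclassical approximation.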

Figure 15.4 considers the same potential as in Fig. 15.3. The figure shows
the WKB functions (15.25) and (15.27), together with the scaled Airy
function (15.26), near the turning point x = a. Note that there is a good match
between the WKB functions and the scaled Airy function when x is close
to, but not too close to, the turning point. Meanwhile, Fig. 15.5 shows
the full approximate wave function with ħ chosen so that (15.31) holds
with n = 39, obtained by using the WKB functions away from the turning
points and the scaled Airy functions near the turning points. Finally,
Fig. 15.6 shows the probability distribution associated to the approximate
wave function, plotted together with the function 1/p(x). (Compare the
discussion preceding Conclusion 15.5.)

FIGURE 15.4. Plots of the scaled Airy function (thick curve) and the WKB
functions, near the turning point x = a.

FIGURE 15.5. The approximate wave function with n = 39.

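We now derive the results in Claim 15.7. The Airy function Ai(u) is
known to have the following asymptotic behavior:

\[ \mathrm{Ai}(u) \approx \frac{1}{2\sqrt{\pi}\,u^{1/4}}\exp\left\{ -\frac{2}{3}u^{3/2} \right\}, \quad u \to +\infty, \tag{15.32} \]

and

\[ \mathrm{Ai}(u) \approx \frac{1}{\sqrt{\pi}\,(-u)^{1/4}}\cos\left( \frac{2}{3}(-u)^{3/2} - \frac{\pi}{4} \right), \quad u \to -\infty. \tag{15.33} \]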


For u tending to −∞, the asymptotics in (15.33) can be obtained by a
straightforward application of the “method of stationary phase,” as
explained in Exercise 9. For u tending to +∞, repeated integrations by parts
(Exercise 8) show that Ai(u) decays faster than any power of u, which is all
that is strictly required for the main theorem of Sect. 15.6. To obtain the
precise asymptotics in (15.32), one should deform the contour of integration
to obtain a different integral representation of Ai(u), and then apply
some variant of the method of stationary phase, such as Laplace’s method
or the method of steepest descent. See Sect. 4.7 of [30] for one approach to
this analysis.

FIGURE 15.6. The probability distribution of the approximate wave function,
plotted against the function 1/p(x).
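
The asymptotic formulas (15.32) and (15.33) can be compared with values of Ai(u) computed independently. The sketch below does not evaluate the oscillatory integral (15.24); instead it assumes the standard facts that Ai solves y″ = uy with Ai(0) = 3^{-2/3}/Γ(2/3) and Ai′(0) = −3^{-1/3}/Γ(1/3), and sums the resulting power series. The function names are hypothetical, and the series is only suitable for moderate |u| because of cancellation.

```python
import math

AI_AT_ZERO = 3.0 ** (-2.0 / 3.0) / math.gamma(2.0 / 3.0)           # Ai(0)
AI_PRIME_AT_ZERO = -(3.0 ** (-1.0 / 3.0)) / math.gamma(1.0 / 3.0)  # Ai'(0)


def airy_ai(u, terms=200):
    """Power-series solution of y'' = u*y with the initial data of Ai.

    The coefficients satisfy a_{n+3} = a_n / ((n + 2)(n + 3)), which gives
    one series in powers u^(3k) and one in powers u^(3k+1).
    """
    even_term, odd_term = 1.0, u
    even_sum, odd_sum = even_term, odd_term
    for k in range(terms):
        even_term *= u ** 3 / ((3 * k + 2) * (3 * k + 3))
        odd_term *= u ** 3 / ((3 * k + 3) * (3 * k + 4))
        even_sum += even_term
        odd_sum += odd_term
    return AI_AT_ZERO * even_sum + AI_PRIME_AT_ZERO * odd_sum


def ai_decaying_asymptotic(u):
    """Right-hand side of (15.32); requires u > 0."""
    return math.exp(-2.0 / 3.0 * u ** 1.5) / (2.0 * math.sqrt(math.pi) * u ** 0.25)


def ai_oscillatory_asymptotic(u):
    """Right-hand side of (15.33); requires u < 0."""
    phase = 2.0 / 3.0 * (-u) ** 1.5 - math.pi / 4.0
    return math.cos(phase) / (math.sqrt(math.pi) * (-u) ** 0.25)


if __name__ == "__main__":
    for u in (2.0, 5.0):
        print(u, airy_ai(u), ai_decaying_asymptotic(u))
    for u in (-2.0, -5.0):
        print(u, airy_ai(u), ai_oscillatory_asymptotic(u))
```

Already at |u| = 5 the two columns should agree to within about one percent, which is consistent with the Airy function being used only at the ends of an interval on which u is large.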

We will use the Airy function on an interval around the turning points
with a length that goes to zero as ħ tends to zero (so that the linear
approximation to the potential gets better and better) but with a length
that is large compared to ħ^{2/3} (so that the value of u at the ends of the
interval will be large, putting us into the asymptotic region of the Airy
function). See Sect. 15.6 for more information.

We use the linear approximation V(x) ≈ E + (a − x)F_0 to the potential near
x = a, where F_0 = −V′(a), which turns the Schrödinger equation (15.6)
into the Airy equation, as previously noted. Now, the linear approximation
to V yields

\[ p(x) \approx \sqrt{2mF_0}\,\sqrt{x - a} \tag{15.34} \]

and

\[ \frac{1}{\hbar}\int_a^x p(y)\,dy \approx \frac{\sqrt{2mF_0}}{\hbar}\,\frac{2}{3}(x - a)^{3/2} = \frac{2}{3}(-u)^{3/2}. \tag{15.35} \]

From here it is a simple matter to check, using (15.33), that

\[ \frac{\sqrt{\pi}}{(2mF_0\hbar)^{1/6}}\,\mathrm{Ai}(u) \approx \frac{1}{\sqrt{p(x)}}\cos\left( \frac{1}{\hbar}\int_a^x p(y)\,dy - \frac{\pi}{4} \right) \]

for x > a, where the approximation holds in an intermediate region where
x is close to a but not too close to a. Thus, if we scale our solution ψ_1 to
the Schrödinger equation so that it is approximated by π^{1/2}(2mF_0ħ)^{-1/6}
times Ai(u) near x = a, it should satisfy (15.27) in the classically allowed
region (but away from the turning points). It is then straightforward to
verify, using (15.32), that this multiple of Ai(u) satisfies (15.25) for x near
−∞. The analysis of ψ_2 is entirely similar.

Finally, to compare the approximations (15.27) and (15.28), we note that

\[ -\frac{1}{\hbar}\int_x^b p(y)\,dy + \frac{\pi}{4} = \left( \frac{1}{\hbar}\int_a^x p(y)\,dy - \frac{\pi}{4} \right) - \zeta, \]

where

\[ \zeta = \frac{1}{\hbar}\int_a^b p(y)\,dy - \frac{\pi}{2}. \]

Now, if ζ is an odd multiple of π, then cos(θ − ζ) = −cos θ, and if ζ is
an even multiple of π, then cos(θ − ζ) = cos θ. For all other values of ζ
(Exercise 4), cos(θ − ζ) is not a constant multiple of cos θ. Thus, (15.31)
is a necessary and sufficient condition for the two approximate solutions to
agree up to a constant in the classically allowed region.



15.6 A Rigorous Error Estimate 


The preceding sections give a treatment of the WKB approximation that is
typical of many books in the literature. This treatment gives the idea that
energies E satisfying the corrected Bohr-Sommerfeld condition
(Condition 15.1) should be approximate eigenvalues for the Hamiltonian operator
H, without specifying the sense in which this approximation holds. In this
section, we prove a rigorous estimate, as follows.


Theorem 15.8 For any potential V and range [E_1, E_2] of energies
satisfying Assumption 15.3, there is a constant C such that the following
holds. For any energy E ∈ [E_1, E_2] satisfying Condition 15.1, there exists
a nonzero function ψ belonging to Dom(H) such that

\[ \|H\psi - E\psi\| \le C\hbar^{9/8}\,\|\psi\|. \tag{15.36} \]

As noted already in Sect. 15.3, an estimate of the form ‖Hψ − Eψ‖ <
ε‖ψ‖ implies that there is a point Ẽ in the spectrum of H with
|E − Ẽ| < ε. (See Exercise 4 in Chap. 10.) Since, under our assumptions on V,
the spectrum of H is purely discrete, we conclude that for each number
E ∈ [E_1, E_2] satisfying Condition 15.1, there is an actual eigenvalue Ẽ for
H with

\[ |E - \tilde{E}| \le C\hbar^{9/8}. \tag{15.37} \]
If E satisfies Condition 15.1, then the estimate (15.37) actually holds
with ħ^{9/8} replaced by ħ^2 on the right-hand side. It is not, however,
possible to obtain such an optimal estimate by the methods we are using
in this chapter. Specifically, the approximate eigenvector ψ constructed
in the proof of Theorem 15.8 does not satisfy an estimate of the form
‖Hψ − Eψ‖ ≤ Cħ^2. One can, however, construct an approximate eigenvector
by different methods (for example, the method in [31]) that satisfies an
order-ħ^2 error estimate, for any E satisfying the corrected Condition 15.1.
Nevertheless, the error bound in (15.37) is small compared to the typical
spacing between the energy levels, which is of order ħ.

Recall, as we noted at the beginning of Sect. 15.4, that a Schrödinger
operator with potential V that is smooth and tends to +∞ at ±∞ is
essentially self-adjoint on C_c^∞(R). The operator H in Theorem 15.8 is,
more precisely, the unique self-adjoint extension of the Schrödinger operator
defined on C_c^∞(R).

15.6.1 Preliminaries 


Our construction of the approximate eigenfunction ψ will be essentially
by the WKB approximation as outlined in Claim 15.7. That is to say,
we will define ψ using scaled Airy functions near the turning points and
by the standard WKB functions in the classically allowed and classically
forbidden regions. There is, however, a difficulty with this approach, which
is that at the boundary between different regions, the scaled Airy function
does not exactly match the WKB functions, but only approximately. What
this means is that if we define ψ by the WKB formula in, say, an interval
of the form (−∞, a − ε) and we define ψ by a scaled Airy function on
(a − ε, a + ε), then ψ may be discontinuous at a − ε. Even if we scale
by a constant on one of these intervals to eliminate the discontinuity in ψ
itself, the derivative of ψ will still probably be discontinuous. But if the
derivative of ψ is discontinuous, ψ is not actually in the domain of H, and
the left-hand side of (15.36) does not make sense. (Compare Sect. 5.2.)

The condition that ψ′ be continuous is not just a technicality: If we
did not worry about continuity of ψ′, then we could always match the
scaled Airy function to the WKB functions, just by multiplying the various
functions by constants, regardless of whether or not the energy satisfies the
corrected Bohr-Sommerfeld condition. In that case, we would be claiming
that any number E ∈ [E_1, E_2] is within Cħ^{9/8} of an eigenvalue of H, which
is false already for the harmonic oscillator.

To work around the difficulty described in the previous paragraphs, we
must put in a transition region over which we smoothly pass from one
function to the other, using the “join” construction described in Sect. 15.6.4.
Thus, we define the function ψ in Theorem 15.8 as follows. We use the
formulas in Claim 15.7 in the indicated intervals, except that we multiply
the functions (15.28), (15.29), and (15.30) by −1 when n is odd. We use
the scaled Airy functions (15.26) and (15.29) on intervals of the form
(a − ε, a + ε) and (b − ε, b + ε), respectively, for some ε depending on ħ in a
manner to be determined later. We then put in four transition regions, each
having length δ, where δ also depends on ħ in a manner to be determined
later. The first transition region, for example, is the interval (a − ε − δ, a − ε)
between the first classically forbidden region and the first turning point.
In each transition region, we change over smoothly from one function to
another. See Fig. 15.7 for an illustration of the transition regions around
the turning point x = a.

FIGURE 15.7. The approximate eigenfunction ψ, with the transition regions
shaded.

Suppose H_0 denotes the Schrödinger operator with potential V, with
domain equal to C_c^∞(R). Then, as we have noted, H_0 is essentially
self-adjoint, and we are letting H, which coincides with the adjoint operator
H_0^*, denote the unique self-adjoint extension of H_0. Now, the domain of
H_0^* consists of all functions ψ ∈ L^2(R) such that the Schrödinger operator
applied to ψ, computed in the distributional sense, again belongs to L^2(R).
In particular, if ψ is smooth, then ψ belongs to the domain of H = H_0^* if
and only if ψ is in L^2(R) and −(ħ^2/2m)ψ″ + Vψ is also in L^2(R).

Because of the joins, our approximate eigenfunction ψ is actually
infinitely differentiable on all of R. And since V(x) tends to +∞ at ±∞,
the exponential WKB functions (15.25) and (15.30) have rapid decay at
infinity, which shows that ψ is in L^2(R). Furthermore, for x near ±∞, the
calculation (15.17) applies, with A(x) = Cq(x)^{-1/2}. We obtain, after a
short calculation,

\[ -\frac{\hbar^2}{2m}\psi''(x) + (V(x) - E)\psi(x) = -\frac{\hbar^2}{2m}\left( \frac{5}{16}\left( \frac{V'(x)}{V(x) - E} \right)^2 - \frac{1}{4}\,\frac{V''(x)}{V(x) - E} \right)\psi(x). \tag{15.38} \]

Since V′/V and V″/V are assumed to be bounded near infinity and V(x)
tends to +∞ at ±∞, we see that the Schrödinger operator applied to ψ is
bounded by a constant times ψ near infinity and is thus square integrable.
This shows that ψ is in the domain of H.


In Sect. 15.6.2, we will take the width 2ε of the region around the turning
points to be of order ħ^{1/2}. In that case, the L^2 norm of our approximate
wave function is of order 1 (bounded and bounded away from zero) as ħ
tends to zero, despite the blow-up of order ħ^{-1/6} very near the turning
points. This result is not hard to verify (Exercise 10); in any case, if anything,
the norm would be blowing up as ħ tends to zero, which would only help
us in showing that ‖Hψ − Eψ‖ is small compared to ‖ψ‖.

To prove Theorem 15.8, we must estimate the contributions to the quantity
‖Hψ − Eψ‖ from four different types of regions: the classically allowed
region, the classically forbidden regions, the regions near the turning points,
and the transition regions. These estimates will occupy the remainder of
this section, with the analysis in the transition regions being the most
involved. In particular, it is essential that the derivative of the scaled Airy
function almost match the derivative of the WKB function in the transition
region, as in the second part of Lemma 15.9.


15.6.2 The Regions Near the Turning Points


We use a scaled Airy function in an interval around each turning point.
[We use (15.26) near x = a and either (15.29) or the negative thereof near
x = b, depending on whether n is even or odd.] We now verify that taking
these intervals to have length of order ħ^{1/2} will give satisfactory estimates.
If ψ denotes one of the scaled Airy functions, then ψ satisfies a Schrödinger
equation in which the potential V is replaced by a linear approximation Ṽ
near one of the turning points, which means that

\[ H\psi - E\psi = (V(x) - \tilde{V}(x))\,\psi. \tag{15.39} \]

The difference between V(x) and its linear approximation Ṽ(x) grows at
most quadratically with the distance from the turning point. Meanwhile,
the asymptotics of the Airy function tell us that it can be bounded as
|Ai(u)| ≤ C|u|^{-1/4}. (This is a terrible estimate for small u, but still true.)
Now u, as defined in (15.22), is of order ħ^{-2/3} times the distance to the
turning point. Since, also, there is a factor of ħ^{-1/6} in (15.26) and the distance
from the turning point is at most of order ħ^{1/2}, we find that

\[ |H\psi - E\psi| \le C(\hbar^{1/2})^2\,\hbar^{-1/6}\,(\hbar^{-2/3}\hbar^{1/2})^{-1/4} = C\hbar^{7/8} \]

over the interval around each turning point. Finally, if a function f satisfies
|f| ≤ D on an interval of length L, then the L^2 norm of f over that interval
will be at most D√L. Thus, over the interval around the turning points,

\[ \|H\psi - E\psi\| = O(\hbar^{7/8}\hbar^{1/4}) = O(\hbar^{9/8}). \]


15.6.3 The Classically Allowed and Classically Forbidden Regions


The expression (15.38) for Hψ − Eψ, derived from (15.17), applies both in
the classically allowed region and in the classically forbidden regions. Let us
consider first the classically allowed region. Although (15.38) is nominally
of order ħ^2, we use this expression on an interval whose ends get closer and
closer to the turning point as ħ tends to zero. Since, also, the expression
in (15.38) is blowing up at the turning points, the contribution to
‖Hψ − Eψ‖ from this interval is of order larger than ħ^2.

We have taken the interval around the turning point to have length 2¢ 
that is of order h!/?, and we will also take (Sect. 15.6.4) the transition 
regions to have length 6 that is of order h!/?. Thus, we use the oscillatory 
WKB function on an interval of the form (a+ 7,b—y), where y = € +0 is 
of order h'/?. Now, the formula for w in the classically allowed regions has 
a factor of 1/\/p(x) times a bounded quantity (the cosine factor). Since 
V'(a) is assumed to be nonzero, V(x) — E behaves like a constant times 
(a — a) and so 1/,/p(x) behaves like a constant time (2 — a)~‘/4 for a 
approaching a, with similar behavior near the other turning point. 

Meanwhile, the more problematic term in (15.38) is the term having 
(V(x) — E)? in the denominator. Keeping in mind the 1/,/p blowup of w 
itself, this term behaves like (x — a)~°/4 as x approaches a. Thus, we may 
estimate the norm of H qw — Ew over the left half of the classically allowed 
region as 


a+y 


a+b) /2 
= O'R — (a+ 6)/2)2)2. 


1/2 
|| — Eyl| < cr? ( / (x —a)-9”? is) 


Since ¥ is of order fi!/?, the contribution to ||Hw — Ey|| from the interval 
(a+, (a+b)/2) will consist of a term of order h?h-7/8 = h®/8, plus lower- 
order terms. The estimate over the other half of the classically allowed 
region is similar. 
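
As a sanity check on the power of $\hbar$ obtained here, one can evaluate the
bound in closed form and read off its log-log slope numerically. The sketch
below is only illustrative: it sets the constant $C$ equal to 1 and takes the
half-width $(b - a)/2$ equal to 1, both of which are our assumptions and not
values taken from the text.

```python
import math

def allowed_region_bound(hbar, half_width=1.0):
    """hbar^2 * sqrt( integral of s^(-9/2) ds from gamma to half_width ),
    with gamma = hbar^(1/2) and s = x - a (constants set to 1)."""
    gamma = math.sqrt(hbar)
    integral = (2.0 / 7.0) * (gamma ** -3.5 - half_width ** -3.5)
    return hbar ** 2 * math.sqrt(integral)

def loglog_slope(h1, h2):
    """Slope of log(bound) against log(hbar) between two small values."""
    num = math.log(allowed_region_bound(h1)) - math.log(allowed_region_bound(h2))
    return num / (math.log(h1) - math.log(h2))

slope = loglog_slope(1e-8, 1e-10)
print(slope)  # close to 9/8 = 1.125
```
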

Meanwhile, in the first classically forbidden region, we also apply (15.38).
By Assumption 15.3, $V'/V$ and $V''/V$ are bounded near infinity. Thus,
$V'/(V - E)$ and $V''/(V - E)$ will also be bounded near infinity, and thus
also bounded on $(-\infty, a - 1)$, since $V - E$ is strictly positive on this
interval and tends to $+\infty$ as $x$ tends to $-\infty$. We see, then, that the norm
of $\hat{H}\psi - E\psi$ over $(-\infty, a - 1)$ is bounded by a constant times $\hbar^2\|\psi\|$.

The norm of $\hat{H}\psi - E\psi$ over an interval of the form $(a - 1, a - \gamma)$ can be
analyzed similarly to the classically allowed region. The estimates from this
region are better, however, because of the exponentially decaying factor in
the definition of the WKB function. Thus, the contribution to $\|\hat{H}\psi - E\psi\|$
from the classically forbidden region $(-\infty, a - \gamma)$ is certainly no larger than
order $\hbar^{9/8}$, and similarly for the other classically forbidden region.

FIGURE 15.8. The join of two functions over the interval $[a, a + \delta]$ (thick curve).


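15.6.4 The Transition Regions


Given two smooth functions $\psi_1$ and $\psi_2$ and some interval of the form
$[a, a + \delta]$, we now define a “join” $\psi_1 \sqcup \psi_2$ of $\psi_1$ and $\psi_2$, where
$(\psi_1 \sqcup \psi_2)(x)$ is equal to $\psi_1(x)$ for $x < a$ and equal to $\psi_2(x)$ for
$x > a + \delta$, and where $\psi_1 \sqcup \psi_2$ is smooth everywhere. Let $\chi$ be a smooth
function on $[0, 1]$ that is identically equal to 0 in a neighborhood of 0 and
identically equal to 1 in a neighborhood of 1. Then define $\psi_1 \sqcup \psi_2$ by


$$(\psi_1 \sqcup \psi_2)(x) = \psi_1(x) + (\psi_2(x) - \psi_1(x))\,\chi((x - a)/\delta).$$


(See Fig. 15.8.) By direct calculation, we have


$$\begin{aligned}
(\hat{H} - EI)(\psi_1 \sqcup \psi_2) ={}& (\hat{H}\psi_1 - E\psi_1) \sqcup (\hat{H}\psi_2 - E\psi_2) \\
& - \frac{\hbar^2}{m\delta}\,(\psi_2'(x) - \psi_1'(x))\,\chi'((x - a)/\delta) \\
& - \frac{\hbar^2}{2m\delta^2}\,(\psi_2(x) - \psi_1(x))\,\chi''((x - a)/\delta).
\end{aligned} \tag{15.40}$$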
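
To make the join construction concrete, here is a minimal Python sketch. The
particular cutoff `chi` below (built from $e^{-1/y}$ and constant on
$[0, 1/4]$ and $[3/4, 1]$) is just one standard choice that we assume for
illustration; the text only requires that $\chi$ be smooth, equal to 0 near 0,
and equal to 1 near 1. The function names are ours.

```python
import math

def chi(s):
    """A smooth cutoff on [0, 1]: 0 on [0, 1/4], 1 on [3/4, 1] (one possible choice)."""
    lo, hi = 0.25, 0.75
    if s <= lo:
        return 0.0
    if s >= hi:
        return 1.0
    t = (s - lo) / (hi - lo)
    f = lambda y: math.exp(-1.0 / y) if y > 0 else 0.0
    return f(t) / (f(t) + f(1.0 - t))

def join(psi1, psi2, a, delta):
    """Return the function psi1 + (psi2 - psi1) * chi((x - a) / delta)."""
    def joined(x):
        return psi1(x) + (psi2(x) - psi1(x)) * chi((x - a) / delta)
    return joined

# Example: join sin (on the left) to cos (on the right) over [0, 0.5].
j = join(math.sin, math.cos, 0.0, 0.5)
print(j(-1.0), j(0.25), j(2.0))
```

The joined function agrees with `psi1` to the left of $a$ and with `psi2` to
the right of $a + \delta$; every derivative of $\chi$ contributes a factor of
$1/\delta$, which is the source of the last two lines of (15.40).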


In constructing our approximate eigenfunction, we use five different
formulas in five different regions: the two classically forbidden regions, the
classically allowed region, and the regions near the two turning points. Since
none of these functions exactly matches the function in the next interval,
we put in a total of four joins in order to produce a function that is in the
domain of $\hat{H}$. We choose the width $\delta$ of the interval on which the join takes
place to be of the same size as the intervals around the turning points,
namely, of order $\hbar^{1/2}$.

The most critical case is the transition from the region near the turning
points to the classically allowed region. Consider, for example, the scaled
Airy function $\psi_1$ in (15.26) and the oscillatory WKB function $\psi_2$ in (15.27).
There are two contributions to the mismatch between these two functions.
First, there is a discrepancy between the Airy function and its leading-order
asymptotics. Second, there is an error in the approximations (15.34)
and (15.35), which comes from the discrepancy between the potential $V(x)$
and its linear approximation $\tilde{V}(x)$ near $x = a$. We need to consider both
contributions to the mismatch in our estimation of $\psi_1 - \psi_2$ and of
$\psi_1' - \psi_2'$.


Lemma 15.9 Let $\psi_1$ denote the scaled Airy function in (15.26), let $\tilde{\psi}_1$
denote the same function with the Airy function replaced by the right-hand
side of (15.33), and let $\psi_2$ denote the oscillatory WKB function in (15.27).
If $x - a$ is positive and of order $\hbar^{1/2}$, we have


$$|\psi_1(x) - \tilde{\psi}_1(x)| = O(\hbar^{1/8}), \qquad |\tilde{\psi}_1(x) - \psi_2(x)| = O(\hbar^{1/8})$$

and

$$|\psi_1'(x) - \tilde{\psi}_1'(x)| = O(\hbar^{-5/8}), \qquad |\tilde{\psi}_1'(x) - \psi_2'(x)| = O(\hbar^{-5/8}).$$


Before giving the proof of this lemma, let us verify that these estimates
are sufficient to control the contribution to $\|\hat{H}\psi - E\psi\|$ from the transition
region $(a + \varepsilon, a + \varepsilon + \delta)$ between the first turning point and the
classically allowed region, where both $\varepsilon$ and $\delta$ are taken to be of order
$\hbar^{1/2}$. We must consider each of the three lines in (15.40). The $L^2$ norm of
the first line is of order at most $\hbar^{9/8}$, by precisely the same argument as in
Sect. 15.6.3.

For the second and third lines, we recall that if a function $f$ is bounded
by $C$, then the $L^2$ norm of $f$ over an interval of length $L$ is at most
$C\sqrt{L}$. Since we are taking the length $\delta$ of our transition interval to be
of order $\hbar^{1/2}$, the $L^2$ norm of the second line of (15.40) is of order


$$\frac{\hbar^2}{\hbar^{1/2}}\,\hbar^{-5/8}\,\hbar^{1/4} = \hbar^{9/8}.$$


Meanwhile, the contribution from the third line of (15.40) is of order


$$\frac{\hbar^2}{\hbar}\,\hbar^{1/8}\,\hbar^{1/4} = \hbar^{11/8}.$$


Thus, the contribution to $\|\hat{H}\psi - E\psi\|$ from the transition region
$(a + \varepsilon, a + \varepsilon + \delta)$ is of order at most $\hbar^{9/8}$.

The analysis of the transition between the classically allowed region and
the region around $x = b$ is entirely similar. The analysis of the transitions
between the regions near the turning points and the classically forbidden
regions is also similar, but much less delicate, because all of the functions
involved are very small in the transition region. When $a - x$ is positive
and of order $\hbar^{1/2}$, for example, $u$, as defined in (15.22), will be of order
$\hbar^{-1/6}$ and so $u^{3/2}$ is of order $\hbar^{-1/4}$. Thus, the exponential factor in the
leading-order asymptotics of the Airy function for $u > 0$ will behave like
$\exp(-C\hbar^{-1/4})$, which is very small for small $\hbar$, certainly smaller than any
power of $\hbar$. Since all the factors in front of the exponential will behave
like $\hbar$ to a power, the overall contribution to $\|\hat{H}\psi - E\psi\|$ from the
transition between the region near the turning points and the classically
forbidden region is smaller than any power of $\hbar$. Thus, none of the
transition regions contributes an error worse than $O(\hbar^{9/8})$.

Proof of Lemma 15.9. We consider only the estimates for the derivatives
of the functions involved. The analysis of the functions themselves is similar
(but easier) and is left as an exercise to the reader (Exercise 11).

We begin by considering $\psi_1' - \tilde{\psi}_1'$. With a little algebra, we compute that


$$\psi_1' - \tilde{\psi}_1' = -\sqrt{\pi}\,(2mF_0)^{1/6}\,\hbar^{-5/6}\left(\mathrm{Ai}'(u) - \widetilde{\mathrm{Ai}}'(u)\right), \tag{15.41}$$


where $u$ is as in (15.22) and where $\widetilde{\mathrm{Ai}}$ is the function on the right-hand
side of (15.33).

Now, $\mathrm{Ai}(u)$ has an asymptotic expansion for $u \to -\infty$ given by


$$\mathrm{Ai}(u) = \widetilde{\mathrm{Ai}}(u)\left(1 + C(-u)^{-3/2} + \cdots\right),$$


and $\mathrm{Ai}'(u)$ has the asymptotic expansion obtained by formally
differentiating this with respect to $u$. [See Eq. (7.64) in [30].] From this, we
obtain


$$\mathrm{Ai}'(u) - \widetilde{\mathrm{Ai}}'(u) = \widetilde{\mathrm{Ai}}'(u)\,O((-u)^{-3/2}) + \widetilde{\mathrm{Ai}}(u)\,O((-u)^{-3/2}). \tag{15.42}$$


From the explicit formula for $\widetilde{\mathrm{Ai}}$, we see that $\widetilde{\mathrm{Ai}}(u)$ is of order
$(-u)^{-1/4}$. Meanwhile, the formula for $\widetilde{\mathrm{Ai}}'(u)$ will contain two terms, the
larger of which will be of order $(-u)^{1/4}$. Thus, the slower-decaying term on
the right-hand side of (15.42) is the first one, which is of order $(-u)^{-5/4}$.
Now, in the transition regions, $|u|$ behaves like $\hbar^{-2/3}\hbar^{1/2} = \hbar^{-1/6}$. Thus,
(15.42) goes like $\hbar^{5/24}$ and so (15.41) goes like $\hbar^{-5/6+5/24} = \hbar^{-5/8}$, as
claimed.
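
The quality of the leading-order asymptotics can also be explored
numerically. The sketch below is a rough illustration and not part of the
proof. It assumes the standard integral normalization of the Airy function,
for which the leading-order oscillatory form is
$\pi^{-1/2}(-u)^{-1/4}\cos(\tfrac{2}{3}(-u)^{3/2} - \pi/4)$, and it evaluates
$\mathrm{Ai}$ from its Maclaurin series, which is adequate only for moderate
$|u|$. The function names are ours.

```python
import math

AI0 = 3.0 ** (-2.0 / 3.0) / math.gamma(2.0 / 3.0)       # Ai(0)
AIP0 = -(3.0 ** (-1.0 / 3.0)) / math.gamma(1.0 / 3.0)   # Ai'(0)

def airy_ai(u, terms=60):
    """Ai(u) = Ai(0) f(u) + Ai'(0) g(u), where f and g are the two power-series
    solutions of y'' = u y with f(0) = 1, f'(0) = 0 and g(0) = 0, g'(0) = 1."""
    f_term, g_term = 1.0, u
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        f_term *= u ** 3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= u ** 3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return AI0 * f_sum + AIP0 * g_sum

def airy_leading(u):
    """Leading-order oscillatory asymptotics of Ai(u) for u < 0."""
    x = -u
    phase = (2.0 / 3.0) * x ** 1.5 - math.pi / 4.0
    return math.cos(phase) / (math.sqrt(math.pi) * x ** 0.25)

# The error should decay like (-u)^(-1/4 - 3/2) = (-u)^(-7/4),
# so the rescaled error below should stay bounded.
for u in (-4.0, -6.0, -8.0):
    err = abs(airy_ai(u) - airy_leading(u))
    print(u, err, err * (-u) ** 1.75)
```
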

We now consider $\tilde{\psi}_1' - \psi_2'$. By direct calculation, the derivatives of
$\tilde{\psi}_1$ and $\psi_2$ each consist of two terms, a “dominant” term obtained by
differentiating the cosine factor and a “subdominant” term obtained by
differentiating the coefficient of the cosine factor. In the case of $\tilde{\psi}_1'$,
the dominant term in the derivative may be simplified to


$$-\frac{1}{\hbar}\left((2mF_0)(x - a)\right)^{1/4} \sin\left(\frac{2}{3}(-u)^{3/2} - \frac{\pi}{4}\right). \tag{15.43}$$


According to Exercise 12, we have, when $x - a$ is of order $\hbar^{1/2}$, the
estimates


$$\left((2mF_0)(x - a)\right)^{1/4} = \sqrt{p} + \sqrt{p}\,O(\hbar^{1/2}) \tag{15.44}$$

and

$$\frac{2}{3}(-u)^{3/2} = \frac{1}{\hbar}\int_a^x p(y)\,dy + O(\hbar^{1/4}). \tag{15.45}$$

Since the derivative of $\sin\theta$ is bounded, a change of order $\hbar^{1/4}$ in the
argument of a sine function produces a change of order $\hbar^{1/4}$ in the value
of the sine. Thus, if we substitute (15.44) and (15.45) into (15.43), we find
that the difference between the dominant term in $\tilde{\psi}_1'$ and the dominant
term in $\psi_2'$ is

$$\frac{1}{\hbar}\sqrt{p}\,O(\hbar^{1/4}) + \text{lower-order terms}.$$


Since $\sqrt{p}$ is of order $(x - a)^{1/4}$ or $\hbar^{1/8}$, we get an error of order
$\hbar^{-5/8}$, as claimed.

Finally, the subdominant terms in the derivatives of $\tilde{\psi}_1$ and $\psi_2$ are easily
seen to be separately of order $\hbar^{-5/8}$. Thus, even without taking into account
the cancellation between these terms, they do not change the order of the
estimate. ∎


15.6.5 Proof of the Main Theorem 


We have estimated the contributions to $\|\hat{H}\psi - E\psi\|$ from each type of
region: the classically allowed and classically forbidden regions, the regions
around the turning points, and the transition regions. In each case, we have
found a contribution that is of order at most $\hbar^{9/8}\|\psi\|$. Thus, it remains
only to verify that the constants in all estimates are bounded uniformly
over the given range $E_1 \le E \le E_2$ of energies.

This verification is straightforward. Near the turning point $x = a$, for
example, we need to estimate the difference between the potential $V(x)$
and its linear approximation $\tilde{V}(x)$ near $x = a$. As a consequence of the
Taylor remainder formula, $|V(x) - \tilde{V}(x)|$ will be bounded by $C|x - a|^2/2$,
where $C$ is the maximum of $|V''(y)|$ for $y$ in the interval from $a$ to $x$. As
$E$ varies over $[E_1, E_2]$, the set of points where we have to evaluate
$|V''(x)|$ will be bounded, meaning that $C$ can be taken to be independent
of $E$, for $E$ in such a range.

Similarly, in the classically allowed region, the blow-up of $1/(V(x) - E)^2$
near $x = a(E)$ can be controlled by the minimum of $|V'(y)|$ for $y$ between
$a$ and $x$. By assumption, $|V'(x)| > 0$ at all the turning points $a(E)$ and
$b(E)$ with $E_1 \le E \le E_2$, and thus, by continuity, in some neighborhood of
that set of turning points. Thus, the blow-up of $1/(V(x) - E)^2$ will be
controlled by the minimum of $|V'(x)|$ on an interval of the form
$[a(E_2) - \alpha, a(E_1) + \alpha]$ for some small $\alpha > 0$. The remaining details of
this verification are left to the reader.


15.7 Other Approaches 


The main complicating factor in the WKB approximation is the singular 
behavior near the turning points. The turning points, meanwhile, are only 
problematic because we are working in the position representation. The
turning points, after all, are the points on the classical trajectory where
the position of the particle achieves a maximum or a minimum. If we were 
to work in the momentum representation, the points where the momen- 
tum achieves a maximum or a minimum would instead be the problematic 
points. A. Voros [42] has proposed working in the Segal-Bargmann repre- 
sentation (Sect. 14.4). In Voros’s analysis, there are no turning points and, 
thus, the analysis is much simpler. The problem with Voros’s approach is 
that he only gives an approximation to the wave function on the classical 
energy curve. Even in simple cases, Voros’s expression does not admit a 
holomorphic extension to the whole plane, but has branching behavior in- 
side the classical energy curve. Thus, Voros’s formula does not define an 
element of the quantum Hilbert space (which is a space of entire holomor- 
phic functions), let alone an element of the domain of the Hamiltonian. 

Nevertheless, it is possible to build approximate eigenfunctions as su- 
perpositions of coherent states, using formulas similar to those in Voros. 
This approach avoids dealing with turning points but still yields a rigorous 
eigenvalue estimate, with the same corrected Bohr-Sommerfeld condition 
as in Condition 15.1. See [31, 23, 7], or (in greater generality) [26]. 


15.8 Exercises 


1. Show that if $c_1$ is any complex number, then we have an identity of
the form


$$c_1 e^{i\theta} + \bar{c}_1 e^{-i\theta} = R\cos(\theta - \delta)$$

for some real numbers $R$ and $\delta$.


2. Let $H(x, p) = p^2/2m + m\omega^2 x^2/2$ be the Hamiltonian for a harmonic
oscillator having mass $m$ and classical frequency $\omega$. Show that a
positive number $E$ satisfies the corrected Bohr-Sommerfeld condition
(Condition 15.1) if and only if $E$ is of the form $(n + 1/2)\hbar\omega$, where $n$
is a non-negative integer.


Note: In light of the results of Chap. 11, this calculation means that,
in this very special case, the corrected Bohr-Sommerfeld condition
gives the exact eigenvalues of the quantum Hamiltonian $\hat{H}$.


3. Suppose $A$ and $p$ are two nonzero, smooth functions satisfying (15.15).
Show that $A(x) = C(p(x))^{-1/2}$ for some constant $C$.


Hint: Think in terms of the logarithms of the functions involved.


4. Show that $\cos(\theta - \delta)$, viewed as a function of $\theta$, agrees, up to
multiplication by a constant, with $\cos(\theta - \delta')$ if and only if $\delta - \delta'$
is an integer multiple of $\pi$.


5. If $\psi$ is an eigenvector for $\hat{H}$ that is approximated by (15.25) near
$-\infty$, one might hope to find an approximate expression for $\psi$ in
the classically allowed region by analytically continuing around the
turning point in the complex plane. Even assuming $V$ is analytic,
however, it is fairly evident that analytic continuation in the upper
half-plane does not give the same answer as in the lower half-plane.
Nevertheless, one could use the average of the upper and lower
half-plane results as a (totally nonrigorous) guess for the behavior of $\psi$
in the classically allowed region.


Show that the above approach gives the correct phase $\delta$ in the
connection formula (15.21) but is off by a factor of 2 in the amplitude $R$.

6. Using integration by parts, show that the limit


$$\lim_{A \to +\infty} \int_0^A \cos\left(\frac{t^3}{3} + ut\right) dt$$


exists.


Hint: Multiply and divide by $t^2 + u$ (avoiding points where $t^2 + u = 0$
in the case $u < 0$).


7. In this exercise, we sketch an argument that the Airy function in
(15.24) satisfies the differential equation $\psi''(u) - u\psi(u) = 0$. For
the purposes of this exercise, let us say that $\int_0^\infty f(t)\,dt = C$ if
$\int_0^A f(t)\,dt = C + g(A)$, where the function $g$ is bounded and oscillates
around an average value of zero.


Assuming that it is legal to differentiate under the integral sign, verify
that $\mathrm{Ai}(u)$ satisfies the stated equation.


Hint: After differentiating under the integral, look for a term that
can be integrated explicitly.


Note: A more rigorous approach to this verification would be to
integrate by parts as in Exercise 6 and then differentiate under the
integral. This approach is, however, a bit messier.


8. By integrating by parts repeatedly in (15.24), show that $\mathrm{Ai}(u)$ decays
faster than any power of $u$ as $u$ tends to $+\infty$.

Hint: A key point is to show that the boundary terms in the
integration by parts vanish at every stage. After performing the
integrations by parts, estimate the resulting integral by using the inequality


$$\frac{1}{(t^2 + u)^n} \le \frac{1}{(t^2 + 1)^k}\,\frac{1}{u^{n-k}}, \qquad u \ge 1,$$


for some appropriate choice of $k$.


9. (a) For $u < 0$, make the change-of-variable $\tau = t/\sqrt{-u}$ in the
integral formula for the Airy function, to obtain the expression


$$\mathrm{Ai}(u) = \frac{\sqrt{-u}}{\pi} \int_0^\infty \cos\left(\alpha\left(\frac{\tau^3}{3} - \tau\right)\right) d\tau, \tag{15.46}$$


where $\alpha = (-u)^{3/2}$.


(b) Suppose $f$ is a smooth function on $[a, b]$ having a unique critical
point $x_0$. Assuming that $x_0$ is in the interior of $[a, b]$ and that
$f''(x_0) \ne 0$, the method of stationary phase asserts that


$$\int_a^b g(x)\,e^{i\alpha f(x)}\,dx = g(x_0)\,e^{i\alpha f(x_0)}\,e^{\pm i\pi/4} \sqrt{\frac{2\pi}{\alpha\,|f''(x_0)|}} + O\left(\frac{1}{\alpha}\right)$$

for $\alpha$ tending to $+\infty$, where the plus sign in the exponent is taken
when $f''(x_0) > 0$ and the minus sign is taken when $f''(x_0) < 0$.
(See, e.g., Eq. (5.12) in [30].)
Using this result, obtain the asymptotic formula (15.33).


Hint: Divide the integral in (15.46) into an integral over $[0, 2]$ and an
integral over $[2, \infty)$. Use stationary phase for the first interval and
integration by parts (as in Exercise 6) for the second interval.


10. Let $\psi$ be the approximate eigenfunction for $\hat{H}$ defined in the
beginning of Sect. 15.6. Show that the norm of $\psi$ is bounded and bounded
away from zero as $\hbar$ tends to zero.


Hint: First show that the $L^2$ norm of $\psi$ over the intervals around
the turning points goes like $\hbar^{-1/6}\hbar^{1/4}$. Then check that the functions
$p(x)^{-1/2}$ and $q(x)^{-1/2}$ are square integrable near the turning points.


11. By imitating the arguments in the proof of Lemma 15.9, prove the
estimates for $\psi_1 - \tilde{\psi}_1$ and $\tilde{\psi}_1 - \psi_2$ in the lemma.


12. By writing $V(x) - E$ as $F_0(a - x)$ plus an error term of order
$(x - a)^2$, verify that the estimates (15.44) and (15.45) in the proof of
Lemma 15.9 hold in the transition region. (Assume that $x - a$ is of order
$\hbar^{1/2}$ in the transition region.)

Hint: The leading-order Taylor expansion of $(1 + z)^a$ is $1 + az + O(z^2)$,
for any real number $a$.


Lie Groups, Lie Algebras, and Representations


An important concept in physics is that of symmetry, whether it be 
rotational symmetry for many physical systems or Lorentz symmetry in 
relativistic systems. In many cases, the group of symmetries of a system is 
a continuous group, that is, a group that is parameterized by one or more 
real parameters. More precisely, the symmetry group is often a Lie group, 
that is, a smooth manifold endowed with a group structure in such a way 
that operations of inversion and group multiplication are smooth. The tan- 
gent space at the identity in a Lie group has a natural “bracket” operation 
that makes the tangent space into a Lie algebra. The Lie algebra of a Lie 
group encodes many of the properties of the Lie group, and yet the Lie 
algebra is easier to work with because it is a linear space. 

In quantum mechanics, the way symmetry is encoded is usually through 
a unitary action of the group on the relevant Hilbert space. That is, we 
assume we are given a unitary representation of the relevant symmetry 
group G, that is, a continuous homomorphism of G into U(H), the group 
of unitary operators on the quantum Hilbert space H. Actually, since two 
unit vectors in H that differ only by a constant represent the same physi- 
cal state, we should more properly consider projective unitary representa- 
tions. A projective representation is a homomorphism of a group G into 
U(H)/U(1), where U(1) is the group of complex numbers of magnitude 1,
thought of as multiples of I in U(H). An ordinary or projective
representation of a Lie group gives rise to an ordinary or projective representation
of its Lie algebra. The angular momentum operators, for example, form a 
representation of the Lie algebra of the rotation group.

Saying that, for example, the Hamiltonian operator of a quantum system
is invariant under rotations means that H commutes with the relevant 
representation of the rotation group and thus also with the associated Lie 
algebra operators. This commutativity, in turn, implies that the eigenspaces 
for H are invariant under rotations. We will use this commutativity in 
Chap. 18 to help us in determining the energy eigenvectors for the hydrogen 
atom. 

In this chapter, we will make a brief survey of Lie groups, Lie algebras, 
and their representations. For our purposes, it suffices to consider matrix 
Lie groups, those that can be realized as closed subgroups of the group of
$n \times n$ invertible matrices. Inevitably, I have had to present some of the
deeper results without proof. Proofs of all results stated here can be found
in [21]. The results of this chapter will be put to use in Chap. 17, in our
study of angular momentum, and in Chap. 18, in our study of the hydrogen 
atom. 


16.1 Summary 


In this chapter, we will consider a matrix Lie group $G$, which is, by
definition, a (topologically) closed subgroup of some $\mathrm{GL}(n;\mathbb{C})$, where
$\mathrm{GL}(n;\mathbb{C})$ is the group of $n \times n$ invertible matrices with complex
entries. To each such $G$, we will associate the Lie algebra $\mathfrak{g}$ of $G$, where
$\mathfrak{g}$ is a real subspace of $M_n(\mathbb{C})$, the space of all $n \times n$ matrices. We
will see that $G$ is automatically an embedded real submanifold of
$M_n(\mathbb{C})$ and that $\mathfrak{g}$ is the tangent space of $G$ at the identity matrix.

Now, $\mathfrak{g}$ is not just a real vector space, but comes with a “bracket”
operation mapping $\mathfrak{g} \times \mathfrak{g}$ into $\mathfrak{g}$. Specifically, we will show that for all
$X$ and $Y$ in $\mathfrak{g}$, the matrix $XY - YX$ belongs again to $\mathfrak{g}$. Thus, we define
our bracket by setting $[X, Y]$ equal to $XY - YX$. As it turns out, the Lie
algebra $\mathfrak{g}$, as a vector space with the bracket operation, encodes a lot of
information about the group $G$. On the other hand, computing at the level
of the Lie algebra is generally easier than computing at the group level,
simply because $\mathfrak{g}$ is a linear space.

We will be interested in unitary representations of our group $G$, that is,
continuous homomorphisms of $G$ into $U(\mathbf{H})$, the group of unitary operators
on a Hilbert space $\mathbf{H}$. If we restrict attention, at first, to the case in which
$\mathbf{H}$ is finite dimensional, then each representation $\Pi$ of $G$ gives rise to a
representation $\pi$ of the Lie algebra $\mathfrak{g}$ of $G$. That is to say, $\pi$ is a linear
map of $\mathfrak{g}$ into the space of linear maps of $\mathbf{H}$ to $\mathbf{H}$, satisfying
$\pi([X, Y]) = [\pi(X), \pi(Y)]$. A deeper question is whether every
representation $\pi$ of $\mathfrak{g}$ comes from a representation $\Pi$ of $G$. As it turns
out, the answer in general is no, but the answer is yes if $G$ is simply
connected.

We may consider, for example, the case $G = SO(3)$. This group is not
simply connected. On the other hand, the Lie algebra $so(3)$ of $SO(3)$ is
isomorphic to the Lie algebra $su(2)$ of $SU(2)$, and $SU(2)$ is simply connected.
[That is, $SU(2)$ is the “universal cover” of $SO(3)$.] Thus, given a
representation $\pi$ of $so(3)$, there may or may not be an associated
representation $\Pi$ of $SO(3)$. Even if there is not, however, there is always a
representation $\Pi'$ of the group $SU(2)$.

In quantum mechanics, the vector $e^{i\theta}\psi$ represents the same physical
state as $\psi$. Thus, it is natural to consider “projective” unitary
representations, that is, homomorphisms of $G$ into the quotient group
$U(\mathbf{H})/\{e^{i\theta}I\}$. In the finite-dimensional case, each projective
representation can be “de-projectivized” at the level of the Lie algebra $\mathfrak{g}$
of $G$. We can then pass from the Lie algebra to the universal cover of $G$,
that is, the simply connected group with Lie algebra $\mathfrak{g}$. In particular, in the finite-dimensional
case, the irreducible projective unitary representations of SO(3) are in one- 
to-one correspondence with irreducible ordinary unitary representations of 
the universal cover SU(2) of SO(3). Although the Hilbert spaces of phys- 
ical systems are usually infinite dimensional, for compact groups such as 
SO(3), general unitary representations can be decomposed as direct sums 
of finite-dimensional ones. (See, e.g., Proposition 17.19 and the discussion 
following it.) 


16.2 Matrix Lie Groups 


Let $M_n(\mathbb{C})$ denote the space of $n \times n$ matrices with complex entries. We
identify $M_n(\mathbb{C})$ with $\mathbb{C}^{n^2}$, equipped with the usual topology. Thus, a
sequence $A_m$ in $M_n(\mathbb{C})$ converges to a matrix $A \in M_n(\mathbb{C})$ if $(A_m)_{jk}$
converges to $A_{jk}$ as $m$ tends to infinity, for all $1 \le j, k \le n$. Let
$\mathrm{GL}(n;\mathbb{C})$ denote the general linear group, consisting of all invertible
$n \times n$ matrices with complex entries. Then $\mathrm{GL}(n;\mathbb{C})$ forms a group under
the operation of matrix multiplication. Furthermore, $\mathrm{GL}(n;\mathbb{C})$, that is,
the set of $A \in M_n(\mathbb{C})$ with $\det A \ne 0$, is an open subset of $M_n(\mathbb{C})$.
Since $M_n(\mathbb{C})$ is a complex vector space of dimension $n^2$, it may be
identified with $\mathbb{C}^{n^2} \cong \mathbb{R}^{2n^2}$. Since $\mathrm{GL}(n;\mathbb{C})$ is an open subset of
$M_n(\mathbb{C})$, it looks locally like $\mathbb{R}^{2n^2}$ and is therefore a real manifold of
dimension $2n^2$.


Definition 16.1 A subgroup $G$ of $\mathrm{GL}(n;\mathbb{C})$ is closed if for each sequence
$A_m$ in $G$ that converges to a matrix $A$, either $A$ is again in $G$ or $A$ is not
invertible. A matrix Lie group is a closed subgroup of some $\mathrm{GL}(n;\mathbb{C})$.


In other words, a subgroup $G$ of $\mathrm{GL}(n;\mathbb{C})$ is closed in the sense of
Definition 16.1 if it is topologically closed as a subset of $\mathrm{GL}(n;\mathbb{C})$; it
need not be closed as a subset of $M_n(\mathbb{C})$. We will see that each matrix Lie
group is a real embedded submanifold of $\mathrm{GL}(n;\mathbb{C})$ and thus is a Lie group.


Definition 16.2 If $G_1$ and $G_2$ are matrix Lie groups, then a Lie group
homomorphism of $G_1$ to $G_2$ is a continuous group homomorphism of $G_1$
into $G_2$. A Lie group homomorphism is called a Lie group isomorphism
if it is one-to-one and onto with continuous inverse. Two matrix Lie groups
are called isomorphic if there exists a Lie group isomorphism between
them.


Example 16.3 The real general linear group, denoted $\mathrm{GL}(n;\mathbb{R})$, is the
group of invertible $n \times n$ matrices with real entries. The groups
$\mathrm{SL}(n;\mathbb{C})$ and $\mathrm{SL}(n;\mathbb{R})$ are, respectively, the groups of complex and real
$n \times n$ matrices with determinant 1. They are called the special linear groups.


Example 16.4 An $n \times n$ matrix $U \in M_n(\mathbb{C})$ is said to be unitary if
$U^*U = UU^* = I$. A matrix $U$ is unitary if and only if


$$\langle Uv, Uw \rangle = \langle v, w \rangle$$


for all $v, w \in \mathbb{C}^n$. The group of unitary matrices is denoted $U(n)$ and called
the ($n \times n$) unitary group. The special unitary group, denoted $SU(n)$,
is the subgroup of $U(n)$ consisting of unitary matrices with determinant 1.


The condition $(U^*U)_{jk} = \delta_{jk}$ is equivalent to the condition that the
columns of $U$ form an orthonormal set in $\mathbb{C}^n$, as can be seen by direct
computation. Geometrically, the condition $U^*U = I$ is equivalent to the
condition that $\langle Uv_1, Uv_2 \rangle = \langle v_1, v_2 \rangle$ for all $v_1, v_2 \in \mathbb{C}^n$, i.e., that
$U$ preserves the inner product on $\mathbb{C}^n$. By taking the determinant of the
condition $U^*U = I$, we see that $|\det U| = 1$ for all $U \in U(n)$.

In this, the finite-dimensional case, the condition $U^*U = I$ implies that
$U^*$ is the inverse of $U$ and thus that $UU^* = I$. This result does not hold
in the infinite-dimensional case.


Example 16.5 An $n \times n$ real matrix $R \in M_n(\mathbb{R})$ is said to be orthogonal
if $R^{\mathrm{tr}}R = RR^{\mathrm{tr}} = I$. A matrix $R$ is orthogonal if and only if


$$\langle Rv, Rw \rangle = \langle v, w \rangle$$


for all $v, w \in \mathbb{R}^n$. The group of orthogonal matrices is denoted $O(n)$ and
is called the ($n \times n$) orthogonal group. The special orthogonal group,
denoted $SO(n)$, is the subgroup of $O(n)$ consisting of orthogonal matrices
with determinant 1.


As in the unitary case, the condition $R^{\mathrm{tr}}R = I$ implies that $RR^{\mathrm{tr}} = I$
and that the columns of $R$ form an orthonormal set in $\mathbb{R}^n$. Geometrically,
a real matrix $R$ is in $O(n)$ if and only if $\langle Rv_1, Rv_2 \rangle = \langle v_1, v_2 \rangle$ for all
$v_1, v_2 \in \mathbb{R}^n$, i.e., if and only if $R$ preserves the inner product on $\mathbb{R}^n$. By
taking the determinant of the condition $R^{\mathrm{tr}}R = I$, we see that
$\det R = \pm 1$ for all $R \in O(n)$.

It is easy to verify that all the groups in Examples 16.3, 16.4, and 16.5
are, indeed, subgroups of $\mathrm{GL}(n;\mathbb{C})$ and that they are closed.


Definition 16.6 A matrix Lie group $G$ is connected if for all $A, B \in G$
there is a continuous path $A(\cdot) : [0, 1] \to M_n(\mathbb{C})$ such that $A(0) = A$ and
$A(1) = B$ and such that $A(t)$ lies in $G$ for all $t$. A matrix Lie group $G$ is
simply connected if it is connected and every continuous loop in $G$ can
be shrunk continuously to a point in $G$. A matrix Lie group $G$ is compact
if it is compact as a subset of $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$.


By the Heine–Borel theorem (e.g., Proposition 0.26 of [12]), a matrix
Lie group $G$ is compact if and only if it is a closed and bounded subset
of $M_n(\mathbb{C})$. The condition we are calling “connected” is, more properly, the
condition of being path connected. We will see, however, that each matrix
Lie group is an embedded real submanifold of $M_n(\mathbb{C})$ and is, therefore,
locally path connected. For matrix Lie groups, then, connectedness and
path connectedness are equivalent.

To prove that a matrix Lie group $G$ is connected, it suffices to prove that
for all $A \in G$, there is a continuous path in $G$ connecting $A$ to $I$. After all,
if both $A$ and $B$ can be connected to $I$, then they can be connected to each
other.


Example 16.7 The groups O(n), SO(n), U(n), and SU(n) are compact. 


Proof. The conditions defining these groups are obtained by setting certain
continuous functions equal to a constant. The group $SU(n)$, for example, is
defined by setting $(U^*U)_{jk} = \delta_{jk}$ for each $j$ and $k$ and by setting
$\det U = 1$. These groups are thus closed not just as subsets of
$\mathrm{GL}(n;\mathbb{C})$ but also as subsets of $M_n(\mathbb{C})$. Furthermore, each of these
groups has the property that each column of any matrix in the group is a
unit vector. Thus, each group is a bounded subset of $M_n(\mathbb{C})$. ∎


Example 16.8 The group U(n) is connected. 


Proof. If $U \in M_n(\mathbb{C})$ is unitary, then $U$ has an orthonormal basis of
eigenvectors with eigenvalues of absolute value 1. Thus, there is another
unitary matrix $V$ (the change of basis matrix) such that


$$U = V \begin{pmatrix} e^{i\theta_1} & & & \\ & e^{i\theta_2} & & \\ & & \ddots & \\ & & & e^{i\theta_n} \end{pmatrix} V^{-1}$$

for some real numbers $\theta_1, \theta_2, \ldots, \theta_n$. Thus, we can define a family $U(t)$
of unitary matrices by setting

$$U(t) = V \begin{pmatrix} e^{it\theta_1} & & & \\ & e^{it\theta_2} & & \\ & & \ddots & \\ & & & e^{it\theta_n} \end{pmatrix} V^{-1}.$$

Then $U(\cdot)$ is a continuous path lying in $U(n)$ with $U(0) = I$ and $U(1) = U$. ∎


Example 16.9 The group SU(2) is simply connected. 


Proof. We claim that


$$SU(2) = \left\{ \begin{pmatrix} \alpha & -\bar{\beta} \\ \beta & \bar{\alpha} \end{pmatrix} \,\middle|\, \alpha, \beta \in \mathbb{C},\ |\alpha|^2 + |\beta|^2 = 1 \right\}.$$


It is easy to see that each matrix of the indicated form is indeed unitary and
has determinant 1. On the other hand, if $U$ is any element of $SU(2)$, then
the first column of $U$ is a unit vector $(\alpha, \beta) \in \mathbb{C}^2$. The second column of
$U$ must then be orthogonal to $(\alpha, \beta)$. Since $(-\bar{\beta}, \bar{\alpha})$ is orthogonal to
$(\alpha, \beta)$ and $\mathbb{C}^2$ is 2-dimensional, the second column of $U$ must be a
multiple of $(-\bar{\beta}, \bar{\alpha})$. But the only multiple that produces a matrix with
determinant 1 is 1.

We see, then, that $SU(2)$ is, topologically, the unit sphere $S^3$ inside
$\mathbb{C}^2 = \mathbb{R}^4$ and is, therefore, simply connected. ∎
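
The first half of this argument, that every matrix of the indicated form is
unitary with determinant 1, is easy to test numerically. The following
Python sketch does so for one sample pair $(\alpha, \beta)$; the helper names
are ours and are purely illustrative.

```python
def su2_matrix(alpha, beta):
    """Return [[alpha, -conj(beta)], [beta, conj(alpha)]] as nested lists."""
    return [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def adjoint_times(m):
    """Compute U* U for a 2x2 complex matrix U: entry (i, j) is sum_k conj(U_ki) U_kj."""
    return [[sum(m[k][i].conjugate() * m[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

# A sample point of the unit sphere in C^2: |alpha|^2 + |beta|^2 = 0.36 + 0.64 = 1.
U = su2_matrix(complex(0.6, 0.0), complex(0.0, 0.8))
print(det2(U))           # approximately 1
print(adjoint_times(U))  # approximately the identity matrix
```
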


16.3 Lie Algebras 


We now introduce the general algebraic concept of a Lie algebra. Once this 
is done, we will show how to associate a real Lie algebra with an arbitrary 
matrix Lie group. 


Definition 16.10 A Lie algebra over a field $\mathbb{F}$ is a vector space $\mathfrak{g}$ over
$\mathbb{F}$, together with a “bracket” map $[\cdot,\cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ having the
following properties:

1. $[\cdot,\cdot]$ is bilinear

2. $[Y, X] = -[X, Y]$ for all $X, Y \in \mathfrak{g}$

3. $[X, X] = 0$ for all $X \in \mathfrak{g}$

4. For all $X, Y, Z \in \mathfrak{g}$ we have the Jacobi identity


$$[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0.$$


If the characteristic of $\mathbb{F}$ is not equal to 2, then Property 3 is a
consequence of Property 2. If $\mathbb{F} = \mathbb{R}$, then we say that $\mathfrak{g}$ is a real Lie
algebra. An example of a real Lie algebra is the vector space $\mathbb{R}^3$ with the
bracket equal to the cross product. Properties 1, 2, and 3 are evident from
the definition of the cross product, while the Jacobi identity is a known
property of the cross product that can be verified by direct calculation.
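
The direct calculation for the cross product can be spot-checked on sample
vectors. The sketch below uses integer vectors so that the arithmetic is
exact; a check on particular vectors is of course no substitute for the
general verification, and the function names are ours.

```python
def cross(x, y):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def add(*vectors):
    """Componentwise sum of any number of 3-tuples."""
    return tuple(sum(components) for components in zip(*vectors))

def jacobi_sum(x, y, z):
    """[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] with the bracket given by the cross product."""
    return add(cross(x, cross(y, z)),
               cross(y, cross(z, x)),
               cross(z, cross(x, y)))

X, Y, Z = (1, 2, 3), (-4, 5, 6), (7, -8, 9)
print(jacobi_sum(X, Y, Z))  # (0, 0, 0)
```
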

A large class of Lie algebras may be obtained by the following procedure.

Example 16.11 Let $A$ be an associative algebra and let $\mathfrak{g}$ be a subspace of
$A$ with the property that for all $x, y$ in $\mathfrak{g}$, $xy - yx$ is again in $\mathfrak{g}$. Then the
bracket

\[ [x, y] = xy - yx \]

makes $\mathfrak{g}$ into a Lie algebra.


In Example 16.11, we may take, for example, g = A. It is evident that 
this bracket satisfies Properties 1, 2, and 3 of a Lie algebra, and the Ja- 
cobi identity is easily verified by direct calculation. As it turns out, every 
Lie algebra is isomorphic to a Lie algebra of this type. (This claim is a 
consequence of the Poincaré—Birkhoff—Witt theorem, which is proved, for 
example, in Sect. 5.2 of [25]. The algebra A in the Poincaré—Birkhoff-Witt 
theorem is the so-called universal enveloping algebra of g.) 
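
For the reader who wants the direct calculation for the Jacobi identity written out, here is one way to organize it. This is only a sketch of the expansion; it uses nothing beyond associativity in $A$ and the definition $[x,y] = xy - yx$.

```latex
\begin{aligned}
[x,[y,z]] &= xyz - xzy - yzx + zyx, \\
[y,[z,x]] &= yzx - yxz - zxy + xzy, \\
[z,[x,y]] &= zxy - zyx - xyz + yxz.
\end{aligned}
% Adding the three lines, each of the six monomials
% xyz, xzy, yzx, zyx, yxz, zxy appears once with a plus sign
% and once with a minus sign, so the sum is zero.
```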


Definition 16.12 If $\mathfrak{g}_1$ and $\mathfrak{g}_2$ are Lie algebras, a map $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is
called a Lie algebra homomorphism if $\phi$ is linear and $\phi$ satisfies

\[ \phi([X,Y]) = [\phi(X), \phi(Y)] \]

for all $X, Y \in \mathfrak{g}_1$. A Lie algebra homomorphism is called a Lie algebra
isomorphism if it is one-to-one and onto.


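Definition 16.13 If $\mathfrak{g}$ is a Lie algebra, a subalgebra of $\mathfrak{g}$ is a subspace $\mathfrak{h}$
of $\mathfrak{g}$ with the property that $[X,Y] \in \mathfrak{h}$ for all $X$ and $Y$ in $\mathfrak{h}$. An ideal in $\mathfrak{g}$
is a subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ with the stronger property that $[X,Y] \in \mathfrak{h}$ for all $X$
in $\mathfrak{g}$ and $Y$ in $\mathfrak{h}$.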


The notion of a subalgebra of a Lie algebra is analogous to the notion 
of a subgroup of a group, while the notion of an ideal in a Lie algebra is 
analogous to the notion of a normal subgroup of a group. In particular, 
the kernel of any Lie algebra homomorphism is an ideal, just as the kernel 
of a group homomorphism is a normal subgroup. 


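Definition 16.14 The direct sum of Lie algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, denoted
$\mathfrak{g}_1 \oplus \mathfrak{g}_2$, is the direct sum of $\mathfrak{g}_1$ and $\mathfrak{g}_2$ as a vector space, equipped with the
bracket given by

\[ [(X_1, Y_1), (X_2, Y_2)] = ([X_1, X_2], [Y_1, Y_2]) \]

for all $X_1, X_2 \in \mathfrak{g}_1$ and $Y_1, Y_2 \in \mathfrak{g}_2$.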


16.4 The Matrix Exponential


In the next section, we will associate a Lie algebra with each matrix Lie
group. To describe this association, we need the notion of the exponential
of a matrix. Given a matrix $X \in M_n(\mathbb{C})$, we define the matrix exponential
of $X$, denoted by $e^X$ or $\exp(X)$, by the usual power series,

\[ e^X = \sum_{m=0}^{\infty} \frac{X^m}{m!}, \]

where $X^0 = I$ (the identity matrix). This series converges absolutely for
all $X \in M_n(\mathbb{C})$, as can easily be seen using the inequality $\|X^m\| \le \|X\|^m$,
where $\|X\|$ is the operator norm of $X$; see Definition A.35. (In this, the
finite-dimensional case, we could just as well use the Hilbert–Schmidt norm,
which amounts to using the usual Euclidean norm on $M_n(\mathbb{C}) \cong \mathbb{C}^{n^2}$. See
Exercise 3.) The matrix exponential shares some but not all of the properties
of the exponential of a number.


Theorem 16.15 The matrix exponential has the following properties for
all $X, Y \in M_n(\mathbb{C})$.

1. $e^0 = I$.

2. $e^{X^{\mathrm{tr}}} = (e^X)^{\mathrm{tr}}$ and $e^{X^*} = (e^X)^*$.

3. If $A$ is an invertible $n \times n$ matrix, then

\[ e^{AXA^{-1}} = A e^X A^{-1}. \]

4. $\det(e^X) = e^{\mathrm{trace}(X)}$.

5. If $XY = YX$, then $e^{X+Y} = e^X e^Y$.

6. $e^X$ is invertible and $(e^X)^{-1} = e^{-X}$.

7. Even if $XY \neq YX$, we have

\[ e^{X+Y} = \lim_{m \to \infty} \left( e^{X/m} e^{Y/m} \right)^m. \]

Here $X^{\mathrm{tr}}$ and $X^*$ denote the transpose and adjoint (conjugate transpose)
of $X$, respectively. Property 7 is known as the Lie Product Formula and is
a special case of the Trotter Product formula (Theorem 20.1). Properties
1, 2, and 3 are easily verified using term-by-term computation. Property 6
follows from Property 5 by taking $Y = -X$ and applying Property 1. The
proofs of Properties 4, 5, and 7 are outlined in Exercises 5, 6, and 7.
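
Some of these properties are easy to probe numerically. The sketch below is an illustration under stated assumptions, not part of any proof: it approximates the exponential by a truncated power series (40 terms, adequate only for small matrices of modest norm), works with 2 × 2 real matrices chosen arbitrarily, and checks Property 4, the failure of Property 5 for a noncommuting pair, and the convergence in Property 7. All helper names are hypothetical.

```python
import math


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(X, terms=40):
    """Partial sum of the series sum_m X^m / m! (a truncation, not exact)."""
    result = identity(len(X))
    term = identity(len(X))
    for m in range(1, terms):
        term = mat_scale(1.0 / m, mat_mul(term, X))  # term is now X^m / m!
        result = mat_add(result, term)
    return result


def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def lie_product(X, Y, m):
    """(e^{X/m} e^{Y/m})^m, computed by repeated multiplication."""
    step = mat_mul(mat_exp(mat_scale(1.0 / m, X)),
                   mat_exp(mat_scale(1.0 / m, Y)))
    result = identity(len(X))
    for _ in range(m):
        result = mat_mul(result, step)
    return result


X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]    # XY != YX
Z = [[0.3, 1.0], [-0.5, 0.2]]   # arbitrary test matrix

# Property 4: det(e^Z) versus e^{trace(Z)}.
det_gap = abs(det2(mat_exp(Z)) - math.exp(Z[0][0] + Z[1][1]))

# Property 5 needs commutativity: for this X, Y the two sides differ visibly.
split_gap = max_diff(mat_exp(mat_add(X, Y)), mat_mul(mat_exp(X), mat_exp(Y)))

# Property 7: the gap should shrink as m grows.
lie_gaps = {m: max_diff(lie_product(X, Y, m), mat_exp(mat_add(X, Y)))
            for m in (16, 1024)}

print(det_gap, split_gap, lie_gaps)
```

The Lie product gap is expected to decay roughly like 1/m, which is why the two values of m are chosen far apart.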

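Suppose a matrix $X$ is diagonalizable, meaning that

\[ X = A \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} A^{-1} \]

for some invertible matrix $A$ and complex numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then
using Property 3 of Theorem 16.15, it is easy to see that

\[ e^X = A \begin{pmatrix} e^{\lambda_1} & & \\ & \ddots & \\ & & e^{\lambda_n} \end{pmatrix} A^{-1}. \]

If $X$ is not diagonalizable, $e^X$ can be computed in terms of the SN decomposition
of $X$. See Sect. 2.2 of [21] for details.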


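Example 16.16 If

\[ X = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}, \]

then

\[ e^X = \begin{pmatrix} \cos a & \sin a \\ -\sin a & \cos a \end{pmatrix}. \]

Proof. The eigenvalues of $X$ are $\pm ia$ and the corresponding eigenvectors
are $(1, \pm i)$. Thus, we may calculate that

\[
\begin{aligned}
e^X &= \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}
       \begin{pmatrix} e^{ia} & 0 \\ 0 & e^{-ia} \end{pmatrix}
       \frac{1}{-2i} \begin{pmatrix} -i & -1 \\ -i & 1 \end{pmatrix} \\
    &= \frac{1}{-2i} \begin{pmatrix} -i(e^{ia} + e^{-ia}) & -e^{ia} + e^{-ia} \\
                                     e^{ia} - e^{-ia} & -i(e^{ia} + e^{-ia}) \end{pmatrix},
\end{aligned}
\]

which simplifies to the desired result. ∎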
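
The diagonalization in this proof can be replayed numerically. The following sketch assumes the eigenvector matrix with columns (1, i) and (1, −i), multiplies out A · diag(e^{ia}, e^{−ia}) · A^{−1} in complex arithmetic, and compares the result with the rotation-type matrix claimed in Example 16.16. The function names and the sample angles are illustrative assumptions.

```python
import cmath
import math


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def exp_by_diagonalization(a):
    """e^X for X = [[0, a], [-a, 0]], computed as A diag(e^{ia}, e^{-ia}) A^{-1}."""
    A = [[1, 1], [1j, -1j]]           # columns: eigenvectors (1, i) and (1, -i)
    D = [[cmath.exp(1j * a), 0], [0, cmath.exp(-1j * a)]]
    scale = 1 / (-2j)                 # 1 / det(A)
    A_inv = [[scale * -1j, scale * -1],
             [scale * -1j, scale * 1]]
    return mat_mul(mat_mul(A, D), A_inv)


def claimed_result(a):
    """The matrix stated in Example 16.16."""
    return [[math.cos(a), math.sin(a)], [-math.sin(a), math.cos(a)]]


def max_error(a):
    computed = exp_by_diagonalization(a)
    expected = claimed_result(a)
    return max(abs(computed[i][j] - expected[i][j])
               for i in range(2) for j in range(2))


for angle in (0.0, 0.7, math.pi / 2, 3.0):
    print(angle, max_error(angle))
```

The computed entries are complex numbers whose imaginary parts should vanish up to rounding, which the comparison via `abs` takes into account.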
The relation $e^{X+Y} = e^X e^Y$ certainly does not hold for general (noncommuting)
matrices $X$ and $Y$. Nevertheless, for any $X \in M_n(\mathbb{C})$ we have

\[ e^{(s+t)X} = e^{sX} e^{tX} \]

for all $s$ and $t$ in $\mathbb{R}$, since $sX$ commutes with $tX$. Thus, for each $X$, the set
of matrices of the form $e^{tX}$, $t \in \mathbb{R}$, forms a subgroup of $\mathrm{GL}(n;\mathbb{C})$. It is not
hard to show (Exercise 4), using term-by-term differentiation, that

\[ \left. \frac{d}{dt} e^{tX} \right|_{t=0} = X. \tag{16.1} \]

Here, the derivative of a matrix-valued function is defined as being entrywise.
[That is, if $f(t)$ is a matrix-valued function, $df/dt$ is the matrix-valued
function whose $(j,k)$ entry is $d(f(t)_{jk})/dt$.]


Definition 16.17 A one-parameter subgroup of $\mathrm{GL}(n;\mathbb{C})$ is a continuous
homomorphism of $\mathbb{R}$ into $\mathrm{GL}(n;\mathbb{C})$, that is, a continuous map
$A : \mathbb{R} \to \mathrm{GL}(n;\mathbb{C})$ such that $A(0) = I$ and $A(s+t) = A(s)A(t)$ for all $s, t \in \mathbb{R}$.


Theorem 16.18 If $A(\cdot)$ is a one-parameter subgroup of $\mathrm{GL}(n;\mathbb{C})$, there
exists a unique $X \in M_n(\mathbb{C})$ such that

\[ A(t) = e^{tX} \]

for all $t \in \mathbb{R}$.


This is Theorem 2.13 in [21]. 


16.5 The Lie Algebra of a Matrix Lie Group 


We now associate a Lie algebra g to each matrix Lie group G. 


Definition 16.19 If $G \subset \mathrm{GL}(n;\mathbb{C})$ is a matrix Lie group, then the Lie
algebra $\mathfrak{g}$ of $G$ is defined as follows:

\[ \mathfrak{g} = \{ X \in M_n(\mathbb{C}) \mid e^{tX} \in G \text{ for all } t \in \mathbb{R} \}. \]

That is to say, $X$ belongs to $\mathfrak{g}$ if and only if the one-parameter subgroup
generated by $X$ lies entirely in $G$. Note that to have $X$ belong to $\mathfrak{g}$, we
need only have $e^{tX}$ belong to $G$ for all real numbers $t$.


Proposition 16.20 For any matrix Lie group G, the Lie algebra g of G 
has the following properties. 


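1. The zero matrix 0 belongs to $\mathfrak{g}$.

2. For all $X$ in $\mathfrak{g}$, $tX$ belongs to $\mathfrak{g}$ for all real numbers $t$.

3. For all $X$ and $Y$ in $\mathfrak{g}$, $X + Y$ belongs to $\mathfrak{g}$.

4. For all $A \in G$ and $X \in \mathfrak{g}$ we have $AXA^{-1} \in \mathfrak{g}$.

5. For all $X$ and $Y$ in $\mathfrak{g}$, the commutator $[X,Y] := XY - YX$ belongs
to $\mathfrak{g}$.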


The first three properties of $\mathfrak{g}$ say that $\mathfrak{g}$ is a real vector space. Since
$M_n(\mathbb{C})$ is an associative algebra under the operation of matrix multiplication,
the last property of $\mathfrak{g}$ shows that $\mathfrak{g}$ is a real Lie algebra
(Example 16.11).

Proof. Points 1 and 2 are elementary, and Point 3 follows from the Lie
product formula, using the assumption that $G$ is closed. Point 4 follows
from Property 3 in Theorem 16.15. To verify Point 5, we observe that the
commutator $[X,Y]$ may be computed as

\[ [X,Y] = \left. \frac{d}{dt} e^{tX} Y e^{-tX} \right|_{t=0}, \]

using (16.1) and an easily verified product rule for differentiation of
matrix-valued functions. For $X, Y \in \mathfrak{g}$, $e^{tX} Y e^{-tX}$ belongs to $\mathfrak{g}$ for all
$t \in \mathbb{R}$, by Point 4. Furthermore, we have already shown that $\mathfrak{g}$ is a real
subspace of $M_n(\mathbb{C})$ and therefore a closed subset of $M_n(\mathbb{C})$. Thus,

\[ [X,Y] = \lim_{h \to 0} \frac{e^{hX} Y e^{-hX} - Y}{h} \]

belongs to $\mathfrak{g}$. ∎


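Example 16.21 Let $\mathfrak{gl}(n;\mathbb{C})$, $\mathfrak{gl}(n;\mathbb{R})$, $\mathfrak{sl}(n;\mathbb{C})$, and $\mathfrak{sl}(n;\mathbb{R})$ denote the Lie
algebras of $\mathrm{GL}(n;\mathbb{C})$, $\mathrm{GL}(n;\mathbb{R})$, $\mathrm{SL}(n;\mathbb{C})$, and $\mathrm{SL}(n;\mathbb{R})$, respectively. Then
we have

\[
\begin{aligned}
\mathfrak{gl}(n;\mathbb{C}) &= M_n(\mathbb{C}) \\
\mathfrak{gl}(n;\mathbb{R}) &= M_n(\mathbb{R}) \\
\mathfrak{sl}(n;\mathbb{C}) &= \{ X \in M_n(\mathbb{C}) \mid \mathrm{trace}(X) = 0 \} \\
\mathfrak{sl}(n;\mathbb{R}) &= \{ X \in M_n(\mathbb{R}) \mid \mathrm{trace}(X) = 0 \}.
\end{aligned}
\]

Proof. Let us consider, for example, the case of $\mathfrak{sl}(n;\mathbb{C})$. By Property 4 of
Theorem 16.15, if $\mathrm{trace}(X) = 0$, then

\[ \det(e^{tX}) = e^{t\,\mathrm{trace}(X)} = 1, \]

so that $e^{tX} \in \mathrm{SL}(n;\mathbb{C})$. In the other direction, if $X \in \mathfrak{sl}(n;\mathbb{C})$, then by
the above calculation, we must have $e^{t\,\mathrm{trace}(X)} = 1$ for all $t \in \mathbb{R}$, which is
possible only if $\mathrm{trace}(X) = 0$. The proofs of the other cases are similar and
are omitted. ∎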


Example 16.22 The Lie algebras $\mathfrak{u}(n)$ and $\mathfrak{su}(n)$ of $\mathrm{U}(n)$ and $\mathrm{SU}(n)$ are
given by

\[
\begin{aligned}
\mathfrak{u}(n) &= \{ X \in M_n(\mathbb{C}) \mid X^* = -X \} \\
\mathfrak{su}(n) &= \{ X \in \mathfrak{u}(n) \mid \mathrm{trace}(X) = 0 \}.
\end{aligned}
\]

The Lie algebra $\mathfrak{so}(n)$ of $\mathrm{SO}(n)$ is given by

\[ \mathfrak{so}(n) = \{ X \in M_n(\mathbb{R}) \mid X^{\mathrm{tr}} = -X \}. \]

Finally, the Lie algebra of $\mathrm{O}(n)$ is equal to $\mathfrak{so}(n)$.

Proof. If $X^* = -X$, then by Property 2 of Theorem 16.15,

\[ (e^{tX})^* = e^{tX^*} = e^{-tX} = (e^{tX})^{-1}, \]

showing that $e^{tX}$ is unitary. In the other direction, if $e^{tX}$ is unitary for all
$t \in \mathbb{R}$, then $(e^{tX})^* = (e^{tX})^{-1} = e^{-tX}$. Thus, $e^{tX^*} = e^{-tX}$. Differentiating
this relation at $t = 0$, using (16.1), gives $X^* = -X$. Thus, the Lie algebra of
$\mathrm{U}(n)$ consists exactly of the matrices with the property that $X^* = -X$. For
the Lie algebra of $\mathrm{SU}(n)$, we add the trace-zero condition, as in the proof
of Example 16.21. The calculations for $\mathrm{SO}(n)$ are similar and are omitted.
Note that if $X \in M_n(\mathbb{R})$ satisfies $X^{\mathrm{tr}} = -X$, then the diagonal entries of $X$
are zero and, thus, $\mathrm{trace}(X)$ is automatically 0. This observation explains
why the Lie algebras of $\mathrm{O}(n)$ and $\mathrm{SO}(n)$ are the same. ∎
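
One direction of this example can be illustrated numerically for n = 2. The sketch below assumes a truncated power series for the exponential (40 terms, which is an approximation suitable only for small matrices) and one arbitrarily chosen skew-self-adjoint, trace-zero matrix X. It then checks that e^{tX} is unitary with determinant 1 up to rounding. The helper names and the sample values are hypothetical.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(X, terms=40):
    """Partial sum of the series sum_m X^m / m! (a truncation, not exact)."""
    result = identity(len(X))
    term = identity(len(X))
    for m in range(1, terms):
        term = mat_scale(1.0 / m, mat_mul(term, X))
        result = mat_add(result, term)
    return result


def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# An arbitrary element of su(2): X* = -X and trace(X) = 0.
X = [[0.3j, 0.5 + 0.2j],
     [-0.5 + 0.2j, -0.3j]]
t = 1.7

U = mat_exp(mat_scale(t, X))

skew_gap = max_diff(adjoint(X), mat_scale(-1, X))            # X* versus -X
unitarity_gap = max_diff(mat_mul(adjoint(U), U), identity(2))  # U*U versus I
det_gap = abs(U[0][0] * U[1][1] - U[0][1] * U[1][0] - 1)     # det(U) versus 1

print(skew_gap, unitarity_gap, det_gap)
```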

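Specializing Example 16.22 to the case $n = 3$ gives

\[
\mathfrak{so}(3) = \left\{ \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{R} \right\}.
\]

We can use the following basis for $\mathfrak{so}(3)$:

\[
F_1 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}; \quad
F_2 := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}; \quad
F_3 := \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\tag{16.2}
\]

Direct calculation establishes the following commutation relations for the
$F_j$'s:

\[
\begin{aligned}
[F_1, F_2] &= F_3 \\
[F_2, F_3] &= F_1 \\
[F_3, F_1] &= F_2.
\end{aligned}
\tag{16.3}
\]

More concisely, we have $[F_1, F_2] = F_3$, together with relations obtained
from this one by cyclic permutation of the indices. Note that all remaining
commutation relations follow from (16.3) by means of the skew-symmetry
of the bracket; we have, for example, $[F_2, F_1] = -F_3$ and $[F_1, F_1] = 0$.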
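
The direct calculation for this basis is short enough to delegate to a few lines of code. The following sketch assumes the standard basis of 3 × 3 skew-symmetric matrices with F1 generating rotations about the first axis, and verifies the three cyclic relations exactly in integer arithmetic. The helper names are hypothetical.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    return [[ab - ba for ab, ba in zip(r1, r2)] for r1, r2 in zip(AB, BA)]


def negate(A):
    return [[-a for a in row] for row in A]


F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
F2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
F3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
ZERO = [[0, 0, 0] for _ in range(3)]

# The cyclic relations, followed by two consequences of skew-symmetry.
print(commutator(F1, F2) == F3)
print(commutator(F2, F3) == F1)
print(commutator(F3, F1) == F2)
print(commutator(F2, F1) == negate(F3))
print(commutator(F1, F1) == ZERO)
```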


16.6 Relationships Between Lie Groups and Lie 
Algebras 


In this section, we explore the relationships between matrix Lie groups and 
their Lie algebras. In particular, we investigate the question of the extent 
to which a matrix Lie group is determined (up to isomorphism) by its Lie 
algebra. We begin by showing that every Lie group homomorphism gives 
rise to a Lie algebra homomorphism in a natural way. 


Theorem 16.23 Suppose $G_1$ and $G_2$ are matrix Lie groups with Lie algebras
$\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively, and suppose $\Phi : G_1 \to G_2$ is a Lie group
homomorphism. Then there exists a unique linear map $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ such
that

\[ \Phi(e^{tX}) = e^{t\phi(X)} \]

for all $t \in \mathbb{R}$ and $X \in \mathfrak{g}_1$. This linear map has the following additional
properties:

1. $\phi([X,Y]) = [\phi(X), \phi(Y)]$ for all $X, Y \in \mathfrak{g}_1$.

2. $\phi(AXA^{-1}) = \Phi(A)\phi(X)\Phi(A)^{-1}$ for all $A \in G_1$ and $X \in \mathfrak{g}_1$.

3. $\phi(X)$ may be computed as

\[ \phi(X) = \left. \frac{d}{dt} \Phi(e^{tX}) \right|_{t=0}. \]

Point 1 shows that $\phi$ is a Lie algebra homomorphism. Part of the assertion
of Point 3 of the theorem is that $\Phi(e^{tX})$ is a smooth function of $t$ for each $X$.

To construct $\phi$, note that since $\Phi$ is a continuous homomorphism, the
map $t \mapsto \Phi(e^{tX})$ is a one-parameter subgroup. By Theorem 16.18, there
exists a unique $Y$ such that $\Phi(e^{tX}) = e^{tY}$ for all $t \in \mathbb{R}$. We then set
$\phi(X) = Y$. An argument similar to the proof of Proposition 16.20 then
establishes the desired properties of $\phi$. See the proof of Theorem 2.21 in
[21] for the details.


Corollary 16.24 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. If $G_1$ is isomorphic to $G_2$, then $\mathfrak{g}_1$ is
isomorphic to $\mathfrak{g}_2$.


Proof. See Exercise 11. ∎

Our next task is to show that for any matrix Lie group G, the Lie algebra 
g of G is large enough to capture what is happening in a neighborhood of 
the identity in G. This will show, for example, that for connected matrix 
Lie groups, a Lie group homomorphism is determined by the corresponding 
Lie algebra homomorphism. 


Theorem 16.25 Let $G$ be a matrix Lie group with Lie algebra $\mathfrak{g}$. Then
there exists a neighborhood $U$ of 0 in $M_n(\mathbb{C})$ and a neighborhood $V$ of $I$ in
$M_n(\mathbb{C})$ such that the matrix exponential maps $U$ diffeomorphically onto $V$
and such that for all $X \in U$, we have that $X$ belongs to $\mathfrak{g}$ if and only if $e^X$
belongs to $G$.


See Theorem 2.27 in [21]. This result has a number of important conse- 
quences. 


Corollary 16.26 Every matrix Lie group $G \subset \mathrm{GL}(n;\mathbb{C})$ is a real embedded
submanifold of $M_n(\mathbb{C})$ with the dimension of $G$ equal to the dimension of
$\mathfrak{g}$ as a real vector space.


The claim means, more precisely, that for each $A \in G$, there exists a
neighborhood $U$ of $A$ and a diffeomorphism $\Phi$ of $U$ with a neighborhood
$V$ of 0 in $\mathbb{R}^{2n^2}$ such that $\Phi(U \cap G) = V \cap \mathbb{R}^d$, where $d = \dim \mathfrak{g}$. That is to
say, after a change of coordinates, $G$ “looks” locally like a little piece of $\mathbb{R}^d$
sitting inside $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$.

Proof. We use exponential coordinates in the neighborhood $V$ of $I$ in
$M_n(\mathbb{C})$, meaning that we write each element $A$ of $V$ as $A = e^X$, with
$X \in U$. Theorem 16.25 says that near the identity, in these coordinates, $G$
“looks like” the real vector space $\mathfrak{g}$ inside $M_n(\mathbb{C})$. Given any other point
$A \in G$, we can use left multiplication by $A^{-1}$ to move the action to the
identity (Exercise 17), with the result that $G$ looks like $\mathfrak{g} \subset M_n(\mathbb{C})$ near $A$.
Thus, $G$ is a real embedded submanifold of dimension $d = \dim \mathfrak{g}$. ∎


Corollary 16.27 The Lie algebra $\mathfrak{g}$ of a matrix Lie group $G$ is the tangent
space to $G$ at $I$. That is to say, $\mathfrak{g}$ coincides with the set of those $X$ in $M_n(\mathbb{C})$
for which there exists a smooth curve $\gamma : \mathbb{R} \to M_n(\mathbb{C})$ lying entirely in $G$
and such that $\gamma(0) = I$ and $\gamma'(0) = X$.


Proof. If $X \in \mathfrak{g}$, then $X$ is the derivative of $e^{tX}$ at $t = 0$, so $\mathfrak{g}$ is contained
in the tangent space at $I$. In the other direction, if $\gamma$ is any smooth curve
in $M_n(\mathbb{C})$ that lies entirely in $G$ and passes through $I$ at $t = 0$, then by
Theorem 16.25, we can express $\gamma$ as $\gamma(t) = e^{\delta(t)}$ (at least for small $t$), where
$\delta$ is a smooth curve in $\mathfrak{g}$ with $\delta(0) = 0$. It is then easy to see (Exercise 8)
that $\gamma'(0) = \delta'(0)$. But if $\delta$ lies in $\mathfrak{g}$, then $\delta'(0)$, which equals $\gamma'(0)$, also lies
in $\mathfrak{g}$, as in the proof of Proposition 16.20. Thus, the tangent space at $I$ is
contained in $\mathfrak{g}$. ∎


Corollary 16.28 If a matrix Lie group $G$ is connected, then for all $A \in G$
there exists a finite sequence $X_1, X_2, \ldots, X_n$ of elements of $\mathfrak{g}$ such that

\[ A = e^{X_1} e^{X_2} \cdots e^{X_n}. \]

Proof. If $G$ is connected in the sense of Definition 16.6 (which really means
that $G$ is path connected), then $G$ is certainly connected in the usual
topological sense of having no nontrivial sets that are both open and closed.
Let $U$ denote the set of points in $G$ that can be expressed as a product
of exponentials of elements of $\mathfrak{g}$. This set is open in $G$ because if $A \in U$
and $B \in G$ is close to $A$, then $A^{-1}B$ is close to $I$ in $G$, and therefore
$A^{-1}B = e^X$ for some $X \in \mathfrak{g}$. Thus, $B = Ae^X$, which means that $B$ is also
a product of exponentials. In the other direction, if $B \in G$ is in the closure
of $U$, then there is some element $A$ of $U$ that is close to $B$. We then have,
again, that $B = Ae^X$ for some $X \in \mathfrak{g}$, which, again, means that $B \in U$.
Now, $G$ is connected and $U$ is both open and closed. Since $U$ is nonempty
($I \in U$), we have $U = G$. ∎


Corollary 16.29 Suppose that $G_1$ and $G_2$ are matrix Lie groups with
Lie algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. Suppose that $\Phi_1 : G_1 \to G_2$ and
$\Phi_2 : G_1 \to G_2$ are Lie group homomorphisms, with associated Lie algebra
homomorphisms $\phi_1$ and $\phi_2$, respectively. If $G_1$ is connected and $\phi_1 = \phi_2$,
then $\Phi_1 = \Phi_2$.

Proof. The result follows from Corollary 16.28 and the condition
$\Phi_j(e^X) = e^{\phi_j(X)}$, $j = 1, 2$. ∎

We have seen that a homomorphism of matrix Lie groups gives rise to a 
homomorphism of the associated Lie algebra, and (Corollary 16.29) that if 
the domain group is connected, the Lie algebra homomorphism determines 
the Lie group homomorphism. A more difficult question is whether we can 
go in the opposite direction, from a Lie algebra homomorphism to a Lie 
group homomorphism. That is to say, given a Lie algebra homomorphism 
between the Lie algebras of two matrix Lie groups, does there exist a Lie 
group homomorphism related in the usual way to the Lie algebra homomorphism?
The answer turns out to be yes, provided that the domain group
$G_1$ is connected and simply connected (i.e., that every continuous loop in
$G_1$ can be shrunk continuously in $G_1$ to a point).


Theorem 16.30 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively, and suppose that $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie
algebra homomorphism. If $G_1$ is connected and simply connected, then
there exists a unique Lie group homomorphism $\Phi : G_1 \to G_2$ such that $\Phi$
and $\phi$ are related as in Theorem 16.23.


One way to prove this deep result is to make use of the Baker–Campbell–Hausdorff
formula. (See, e.g., Chap. 3 of [21].) This formula states that for
all sufficiently small $X$ and $Y$ in $M_n(\mathbb{C})$ we have

\[ e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] + \cdots}. \]

Here $\cdots$ denotes terms that are expressible in terms of repeated commutators
involving $X$ and $Y$, with coefficients that are “universal,” that is,
independent of $n$ (the size of the matrices) and of the choice of $X$ and $Y$ in
$M_n(\mathbb{C})$. Given a Lie algebra homomorphism $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$, one can use the
Baker–Campbell–Hausdorff formula to construct a “local homomorphism,”
mapping a neighborhood of the identity in $G_1$ into $G_2$. If $G_1$ is connected
and simply connected, it is possible to extend this local homomorphism to a
global homomorphism. See Sect. 3.6 of [21] for the details of this construction.
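
The displayed terms of the Baker–Campbell–Hausdorff series can be sanity-checked numerically. The sketch below is an illustration only: it assumes a truncated power series for the exponential, takes one particular pair of small 2 × 2 matrices scaled by a parameter eps, and compares the product of the two exponentials with the exponential of the series truncated after the third-order commutators. If the listed terms are right, the discrepancy should be of fourth order in eps, so halving eps should shrink it by roughly a factor of 16. All names are hypothetical.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(X, terms=40):
    """Partial sum of the series sum_m X^m / m! (a truncation, not exact)."""
    result = identity(len(X))
    term = identity(len(X))
    for m in range(1, terms):
        term = mat_scale(1.0 / m, mat_mul(term, X))
        result = mat_add(result, term)
    return result


def commutator(A, B):
    return mat_add(mat_mul(A, B), mat_scale(-1.0, mat_mul(B, A)))


def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def bch_third_order(X, Y):
    """X + Y + 1/2 [X,Y] + 1/12 [X,[X,Y]] - 1/12 [Y,[X,Y]]."""
    XY = commutator(X, Y)
    Z = mat_add(X, Y)
    Z = mat_add(Z, mat_scale(0.5, XY))
    Z = mat_add(Z, mat_scale(1.0 / 12.0, commutator(X, XY)))
    Z = mat_add(Z, mat_scale(-1.0 / 12.0, commutator(Y, XY)))
    return Z


def truncation_error(eps):
    """Gap between e^X e^Y and exp of the third-order truncation."""
    X = [[0.0, eps], [0.0, 0.0]]
    Y = [[0.0, 0.0], [eps, 0.0]]
    return max_diff(mat_mul(mat_exp(X), mat_exp(Y)),
                    mat_exp(bch_third_order(X, Y)))


errors = {eps: truncation_error(eps) for eps in (0.1, 0.05)}
print(errors, errors[0.1] / errors[0.05])
```

This checks only the coefficients shown in the formula, for one sample pair; it says nothing about convergence of the full series.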


Corollary 16.31 Suppose that $G_1$ and $G_2$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. If $G_1$ and $G_2$ are connected and simply
connected and $\mathfrak{g}_1$ is isomorphic to $\mathfrak{g}_2$, then $G_1$ is isomorphic to $G_2$.


Proof. Suppose $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$ is a Lie algebra isomorphism. Since $G_1$ is
connected and simply connected, there exists a Lie group homomorphism
$\Phi : G_1 \to G_2$ related in the usual way to $\phi$. Since $G_2$ is connected and
simply connected, there exists a Lie group homomorphism $\Psi : G_2 \to G_1$
related in the usual way to $\phi^{-1}$. Consider now the homomorphism
$\Psi \circ \Phi : G_1 \to G_1$.
By the composition property of Lie algebra homomorphisms (Exercise 10),
the Lie algebra homomorphism associated with $\Psi \circ \Phi$ is $\phi^{-1} \circ \phi = I$. It then
follows from Corollary 16.29 that $\Psi \circ \Phi = I$. A similar argument shows that
$\Phi \circ \Psi = I$, which means that $\Phi$ is a Lie group isomorphism. ∎

Corollary 16.31 does not hold without the assumption that both groups 
are simply connected, as the following important example shows. 


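Example 16.32 The Lie algebras $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are isomorphic, but the
groups $\mathrm{SU}(2)$ and $\mathrm{SO}(3)$ are not isomorphic.


Since $\mathrm{SU}(2)$ is simply connected (Example 16.9), $\mathrm{SO}(3)$ must fail to be
simply connected. Indeed, $\pi_1(\mathrm{SO}(3)) = \mathbb{Z}/2$, as can be seen from
Example 16.34.

Proof. The Lie algebra $\mathfrak{su}(2)$ of $\mathrm{SU}(2)$ is the space of $2 \times 2$ skew-self-adjoint
matrices with trace zero. Explicitly,

\[
\mathfrak{su}(2) = \left\{ \begin{pmatrix} ia & b + ic \\ -b + ic & -ia \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{R} \right\}.
\]

We may consider the following basis for $\mathfrak{su}(2)$:

\[
E_1 = \frac{1}{2} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}; \quad
E_2 = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}; \quad
E_3 = \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.
\tag{16.4}
\]

Direct calculation shows that $[E_1, E_2] = E_3$, along with the relations obtained
from this one by cyclic permutation of the indices. These are the same relations as
those satisfied by the basis elements $F_j$, $j = 1, 2, 3$, for $\mathfrak{so}(3)$ in (16.2)
and (16.3). Thus, there is a Lie algebra isomorphism $\phi : \mathfrak{su}(2) \to \mathfrak{so}(3)$ such
that $\phi(E_j) = F_j$, $j = 1, 2, 3$.

On the other hand, there can be no isomorphism between $\mathrm{SU}(2)$ and
$\mathrm{SO}(3)$, since $\mathrm{SU}(2)$ has a nontrivial center (containing at least $I$ and $-I$),
whereas the center of $\mathrm{SO}(3)$ is trivial (Exercise 14). ∎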
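
The relations among the basis elements of su(2) can be confirmed mechanically. The sketch below assumes the basis consisting of one half times the matrices with rows (i, 0), (0, −i); (0, 1), (−1, 0); and (0, i), (i, 0), which is how the basis of this example is read here. It checks the cyclic commutation relations, skew-self-adjointness, and the trace condition in complex arithmetic. Helper names are hypothetical.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def commutator(A, B):
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    return [[ab - ba for ab, ba in zip(r1, r2)] for r1, r2 in zip(AB, BA)]


def adjoint(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def negate(A):
    return [[-a for a in row] for row in A]


E1 = [[0.5j, 0], [0, -0.5j]]
E2 = [[0, 0.5], [-0.5, 0]]
E3 = [[0, 0.5j], [0.5j, 0]]
basis = [E1, E2, E3]

# Gaps in the cyclic relations [E1,E2]=E3, [E2,E3]=E1, [E3,E1]=E2.
relation_gaps = [max_diff(commutator(basis[j], basis[(j + 1) % 3]),
                          basis[(j + 2) % 3]) for j in range(3)]

# Membership in su(2): E* = -E and trace(E) = 0.
skew_gaps = [max_diff(adjoint(E), negate(E)) for E in basis]
traces = [E[0][0] + E[1][1] for E in basis]

print(relation_gaps, skew_gaps, traces)
```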


Definition 16.33 Suppose $G$ is a connected matrix Lie group with Lie
algebra $\mathfrak{g}$. A universal cover of $G$ is an ordered pair $(\tilde{G}, \Phi)$ consisting
of a simply connected matrix Lie group $\tilde{G}$ and a Lie group homomorphism
$\Phi : \tilde{G} \to G$ such that the associated Lie algebra homomorphism $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$
is an isomorphism of the Lie algebra $\tilde{\mathfrak{g}}$ of $\tilde{G}$ with $\mathfrak{g}$. The map $\Phi$ is called
the covering map for $\tilde{G}$.


Although each Lie group has a universal cover that is again a Lie group,
the universal cover of a matrix Lie group may not be isomorphic to any
matrix Lie group. [The universal cover of $\mathrm{SL}(2;\mathbb{R})$, e.g., is not a matrix Lie
group.] It can be shown, however, that if a matrix Lie group $G$ is compact,
then the universal cover of $G$ is again a matrix Lie group (not necessarily
compact).

Suppose $\tilde{G}$ is any simply connected Lie group with a Lie algebra $\tilde{\mathfrak{g}}$ that
is isomorphic to $\mathfrak{g}$. The choice of a particular isomorphism $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$ gives
rise, by Theorem 16.30, to a Lie group homomorphism $\Phi : \tilde{G} \to G$, so that
$(\tilde{G}, \Phi)$ is a universal cover of $G$.

If $(\tilde{G}, \Phi)$ is a universal cover of $G$, it is often convenient to use the
isomorphism $\phi$ to identify $\tilde{\mathfrak{g}}$ with $\mathfrak{g}$. If we follow this convention, we may
say that a universal cover of $G$ is a simply connected group $\tilde{G}$ having “the
same” Lie algebra as $G$.

If $(\tilde{G}_1, \Phi_1)$ and $(\tilde{G}_2, \Phi_2)$ are two universal covers of a given matrix Lie
group $G$, then there is a unique Lie group isomorphism $\Psi : \tilde{G}_1 \to \tilde{G}_2$ such
that $\Phi_2(\Psi(A)) = \Phi_1(A)$ for all $A \in \tilde{G}_1$. (This result follows easily from
Corollary 16.31.) In light of this uniqueness result, we will often speak of
“the” universal cover of $G$.


Example 16.34 Let $\Phi : \mathrm{SU}(2) \to \mathrm{SO}(3)$ be the unique Lie group homomorphism
for which the associated Lie algebra homomorphism $\phi$ satisfies
$\phi(E_j) = F_j$, $j = 1, 2, 3$. Then $\ker \Phi = \{I, -I\}$ and $(\mathrm{SU}(2), \Phi)$ is a universal
cover of $\mathrm{SO}(3)$.


Proof. Since $E_1$ is diagonal, it is easy to see that $e^{2\pi E_1} = -I$ in $\mathrm{SU}(2)$.
On the other hand, by a trivial extension of Example 16.16, we have

\[
e^{aF_1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos a & -\sin a \\ 0 & \sin a & \cos a \end{pmatrix}
\]

for all $a \in \mathbb{R}$. In particular, $e^{2\pi F_1} = I$. Thus,

\[ \Phi(-I) = \Phi(e^{2\pi E_1}) = e^{2\pi F_1} = I. \]

This shows that $-I$ belongs to the kernel of $\Phi$.
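
The two exponentials used in this step can be checked numerically. The sketch below assumes a truncated power series for the matrix exponential (60 terms here, since the matrices have norm about 2π) and the forms of E1 and F1 read off from (16.4) and (16.2). It reports how far the exponential of 2πE1 is from −I and how far the exponential of 2πF1 is from I. Helper names are hypothetical.

```python
import math


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(X, terms=60):
    """Partial sum of the series sum_m X^m / m! (a truncation, not exact)."""
    result = identity(len(X))
    term = identity(len(X))
    for m in range(1, terms):
        term = mat_scale(1.0 / m, mat_mul(term, X))
        result = mat_add(result, term)
    return result


def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


E1 = [[0.5j, 0], [0, -0.5j]]                  # basis element of su(2)
F1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]       # basis element of so(3)

exp_E = mat_exp(mat_scale(2 * math.pi, E1))
exp_F = mat_exp(mat_scale(2 * math.pi, F1))

gap_E = max_diff(exp_E, mat_scale(-1.0, identity(2)))   # distance from -I
gap_F = max_diff(exp_F, identity(3))                    # distance from I

print(gap_E, gap_F)
```

Both gaps should be at the level of floating-point rounding, reflecting that a full turn in SO(3) corresponds to only half of a closed loop in SU(2).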

Now, since $\phi$ is injective, $\Phi$ is injective in a neighborhood of $I$. After all,
given distinct elements $A$ and $B$ of $\mathrm{SU}(2)$ near $I$, Theorem 16.25 tells us
that we can express $A$ as $e^X$ and $B$ as $e^Y$, with $X$ and $Y$ being distinct
small elements of $\mathfrak{su}(2)$. Then $\phi(X)$ and $\phi(Y)$ are distinct small elements
of $\mathfrak{so}(3)$. Applying Theorem 16.25 again tells us that $\Phi(A) = e^{\phi(X)}$ and
$\Phi(B) = e^{\phi(Y)}$ are distinct.

We see, then, that $\ker \Phi$ is a discrete normal subgroup of $\mathrm{SU}(2)$. But a
standard exercise (Exercise 1) shows that a discrete normal subgroup of a
connected group is automatically central. On the other hand, it is easily
verified (Exercise 2) that the center of $\mathrm{SU}(2)$ is $\{I, -I\}$, so $\ker \Phi$ cannot be
larger than $\{I, -I\}$.

To show that $\Phi$ maps onto $\mathrm{SO}(3)$, we first verify (Exercise 13) that each
element $R$ of $\mathrm{SO}(3)$ can be expressed as $R = e^X$, with $X \in \mathfrak{so}(3)$. Since $\phi$
is surjective and $\Phi(e^X) = e^{\phi(X)}$, $\Phi$ maps onto $\mathrm{SO}(3)$. ∎


16.7 Finite-Dimensional Representations of Lie
Groups and Lie Algebras


A representation of a group $G$ is a homomorphism $\Pi$ of $G$ into $\mathrm{GL}(V)$,
the group of invertible linear transformations on some vector space. If $\Pi$
is injective then $G$ is isomorphic to its image under $\Pi$; thus, $\Pi$ serves to
“represent” $G$ concretely as a group of invertible linear transformations.
(We continue to use the term “representation” even if $\Pi$ is not injective.)
Similarly, a representation of a Lie algebra $\mathfrak{g}$ is a Lie algebra homomorphism
of $\mathfrak{g}$ into $\mathfrak{gl}(V)$, the space of all linear transformations of $V$, where we equip
$\mathfrak{gl}(V)$ with the bracket $[X,Y] := XY - YX$.

Recall that an action of a group $G$ on a set $X$ is a map from $G \times X$ to $X$,
denoted $(g, x) \mapsto g \cdot x$, satisfying $e \cdot x = x$ for all $x \in X$ and $g \cdot (h \cdot x) = (gh) \cdot x$
for all $g, h \in G$ and $x \in X$. A representation $\Pi$ of $G$ on some vector space
$V$ gives rise to a linear action of $G$ on $V$, given by $g \cdot v = \Pi(g)v$. (A linear
action is an action for which the map $v \mapsto g \cdot v$ is linear for each $g$.) Thus,
we may use $g \cdot v$ as an alternative notation to $\Pi(g)v$, when convenient.


16.7.1 Finite-Dimensional Representations 


If G is a matrix Lie group, then G is already represented as a group of 
matrices. Nevertheless, it is of interest [as we will see in Chap. 17 in the 
case $G = \mathrm{SO}(3)$] to explore other representations of $G$. Since a matrix Lie
group has a topological structure (inherited from $M_n(\mathbb{C})$), it is natural to
require representations to be continuous. It is also simpler to deal at first 
with finite-dimensional representations, that is, those where the vector 
space in question is finite dimensional, although eventually we will need to 
consider infinite-dimensional representations as well. This discussion leads 
to the following definition. 


Definition 16.35 Let $G \subset \mathrm{GL}(n;\mathbb{C})$ be a matrix Lie group. A finite-
dimensional representation of $G$ is a continuous homomorphism of $G$
into GL(V), the group of invertible linear transformations of a finite- 
dimensional vector space V. 


We will assume that all of our vector spaces are over the field C, even 
though it is occasionally of interest to consider also representations over R. 
The topology on GL(V) is defined by picking a basis, and thereby identifying 
the space of linear maps of $V$ to $V$ with $M_n(\mathbb{C})$. We then use the subset
topology on $\mathrm{GL}(V) \cong \mathrm{GL}(n;\mathbb{C}) \subset M_n(\mathbb{C})$. This topology is easily seen to
be independent of the choice of basis. 

An important example of representations in quantum theory arises from
the time-independent Schrödinger equation in $\mathbb{R}^n$, namely the equation
$\hat{H}\psi = E\psi$, for a fixed constant $E \in \mathbb{R}$. If $\hat{H}$ is invariant under rotations,
then the space of solutions to this equation is invariant under rotations.
Note that an individual solution $\psi$ to this equation may or may not be a
rotationally invariant (i.e., radial) function. But if $\hat{H}$ is rotationally invariant,
then rotating a solution to $\hat{H}\psi = E\psi$ will give another solution of this
equation. Even if the quantum Hilbert space is infinite dimensional, the
solution spaces to $\hat{H}\psi = E\psi$ are typically finite dimensional and constitute
finite-dimensional representations of the group $\mathrm{SO}(n)$ of rotations. If
we can understand what all possible finite-dimensional representations of
$\mathrm{SO}(n)$ look like, we will have made a lot of progress in understanding solutions
to $\hat{H}\psi = E\psi$ in the rotationally invariant case. This line of reasoning
will be explored in detail in Chap. 18.

We may consider as well finite-dimensional representations of Lie alge- 
bras. Assuming our Lie algebra g is finite dimensional (which is the only 
case we will consider in this chapter), there is no need to impose a re- 
quirement of continuity, since a linear map of one finite-dimensional real 
or complex vector space to another is automatically continuous. 


Definition 16.36 A finite-dimensional representation of a Lie algebra 
g is a Lie algebra homomorphism of g into gl(V), the space of all linear 
transformations of V. Here gl(V) is considered as a Lie algebra with bracket 
given by [X,Y] = XY -YX. 


We typically consider Lie algebras defined over the field $\mathbb{R}$, since the Lie
algebra of a matrix Lie group is in general only a real subspace of $M_n(\mathbb{C})$.
Nevertheless, it is convenient to consider vector spaces over $\mathbb{C}$. If $\mathfrak{g}$ is a
real Lie algebra and $V$, and therefore also $\mathfrak{gl}(V)$, is a complex vector space,
then we require only that $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ be real linear, which is the only
requirement that makes sense.

In the interest of simplifying the terminology, we will sometimes speak 
of “a representation $V$,” without making explicit mention of the homomorphism
$\Pi$ or $\pi$.


Definition 16.37 If $\Pi : G \to \mathrm{GL}(V)$ is a representation of a matrix Lie
group $G$, then a subspace $W$ of $V$ is called an invariant subspace if
$\Pi(g)w \in W$ for all $g \in G$ and $w \in W$. Similarly, if $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is
a representation of a Lie algebra $\mathfrak{g}$, then a subspace $W$ of $V$ is called an
invariant subspace if $\pi(X)w \in W$ for all $X \in \mathfrak{g}$ and $w \in W$. A representation
of a group or Lie algebra is called irreducible if the only invariant
subspaces are $W = V$ and $W = \{0\}$.


Definition 16.38 If $(\Pi, V_1)$ and $(\Sigma, V_2)$ are representations of a matrix
Lie group $G$, a map $\Phi : V_1 \to V_2$ is called an intertwining map (or
morphism) if $\Phi(\Pi(g)v) = \Sigma(g)\Phi(v)$ for all $g \in G$ and $v \in V_1$, with an analogous
definition for intertwining maps of Lie algebra representations. If an
intertwining map is an invertible linear map, it is called an isomorphism.
Two representations are said to be isomorphic (or equivalent) if there
exists an isomorphism between them.

In the “action” notation, the requirement on an intertwining map $\Phi$ is
that $\Phi(g \cdot v) = g \cdot \Phi(v)$, meaning that $\Phi$ commutes with the action of $G$.
A typical goal of representation theory is to classify all finite-dimensional
irreducible representations of $G$ up to isomorphism.

Given a representation $\Pi : G \to \mathrm{GL}(V)$ of a matrix Lie group $G$, we
can identify $\mathrm{GL}(V)$ with $\mathrm{GL}(N;\mathbb{C})$ and $\mathfrak{gl}(V)$ with $\mathfrak{gl}(N;\mathbb{C})$ by picking a
basis for $V$. We may then apply Theorem 16.23 to obtain a representation
$\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ such that

\[ \Pi(e^X) = e^{\pi(X)} \]

for all $X \in \mathfrak{g}$.


Proposition 16.39 Suppose $G$ is a connected matrix Lie group with Lie
algebra $\mathfrak{g}$. Suppose that $\Pi : G \to \mathrm{GL}(V)$ is a finite-dimensional representation
of $G$ and $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is the associated Lie algebra representation.
Then a subspace $W$ of $V$ is invariant under the action of $G$ if and only if it
is invariant under the action of $\mathfrak{g}$. In particular, $\Pi$ is irreducible if and only
if $\pi$ is irreducible. Furthermore, two representations of $G$ are isomorphic if
and only if the associated Lie algebra representations are isomorphic.


In general, given a representation $\pi$ of $\mathfrak{g}$, there may be no representation
$\Pi$ such that $\pi$ and $\Pi$ are related in the usual way. If, however, $G$ is simply
connected, Theorem 16.30 tells us that there is, in fact, a $\Pi$ associated with
every $\pi$.

Proof. Suppose $W \subset V$ is invariant under $\pi(X)$ for all $X \in \mathfrak{g}$. Then
$W$ is invariant under $\pi(X)^m$ for all $m$. Since $V$ is finite dimensional, any
subspace of it is automatically a closed subset and thus $W$ is invariant
under

\[ \Pi(e^X) = e^{\pi(X)} = \sum_{m=0}^{\infty} \frac{\pi(X)^m}{m!}. \]

Since $G$ is connected, every element of $G$ is (Corollary 16.28) a product
of exponentials of elements of $\mathfrak{g}$, and so $W$ is invariant under $\Pi(A)$ for all
$A \in G$.

In the other direction, if $W$ is invariant under $\Pi(A)$ for all $A \in G$, then
since $W$ is closed, it is invariant under

\[ \pi(X) = \lim_{h \to 0} \frac{\Pi(e^{hX}) - I}{h} \]

for all $X \in \mathfrak{g}$.

Now suppose $\Pi_1$ and $\Pi_2$ are two representations of $G$, acting on vector
spaces $V_1$ and $V_2$, respectively. If $\Phi : V_1 \to V_2$ is an invertible linear map,
then an argument similar to the above shows $\Phi\Pi_1(A) = \Pi_2(A)\Phi$ for all
$A \in G$ if and only if $\Phi\pi_1(X) = \pi_2(X)\Phi$ for all $X \in \mathfrak{g}$. Thus, $\Phi$ is an
isomorphism of group representations if and only if it is an isomorphism of
Lie algebra representations. ∎


Theorem 16.40 (Schur’s Lemma) If $V_1$ and $V_2$ are two irreducible
representations of a group or Lie algebra, then the following hold.


1. If $\Phi : V_1 \to V_2$ is an intertwining map, then either $\Phi = 0$ or $\Phi$ is an
isomorphism.


2. If $\Phi : V_1 \to V_2$ and $\Psi : V_1 \to V_2$ are nonzero intertwining maps, then
there exists a nonzero constant $c \in \mathbb{C}$ such that $\Phi = c\Psi$. In particular,
if $\Phi$ is an intertwining map of $V_1$ to itself then $\Phi = cI$.


Although the first part of Schur’s lemma holds for representations over
an arbitrary field, the second part holds only for representations over
algebraically closed fields.

Proof. It is easy to see that $\ker \Phi$ is an invariant subspace of $V_1$. Since
$V_1$ is irreducible, this means that either $\ker \Phi = V_1$, in which case $\Phi = 0$,
or $\ker \Phi = \{0\}$, in which case $\Phi$ is injective. Similarly, the range of $\Phi$ is
invariant, and thus equal to either $\{0\}$ or $V_2$. If $\Phi$ is not zero, then the
range of $\Phi$ is not zero, hence all of $V_2$. Thus, if $\Phi$ is not zero, it is both
injective and surjective, establishing Point 1.

For Point 2, since $\Phi$ and $\Psi$ are nonzero, they are isomorphisms, by
Point 1. It suffices to prove that $\Gamma := \Phi^{-1}\Psi$ is a multiple of the identity,
where $\Gamma$ is an intertwining map of $V_1$ to itself. Since we are working
over $\mathbb{C}$, $\Gamma$ must have at least one eigenvalue $\lambda$. If $W$ denotes the
$\lambda$-eigenspace of $\Gamma$, then $W$ is invariant under the action of the group or Lie
algebra. After all, if $\Gamma w = \lambda w$, then (in the notation of the group case)
$\Gamma(\Pi(A)w) = \Pi(A)\Gamma w = \lambda\Pi(A)w$. Since $\lambda$ is an eigenvalue of $\Gamma$, the
invariant subspace $W$ is nonzero and thus $W = V_1$, which means precisely
that $\Gamma = \lambda I$. ∎


16.7.2 Unitary Representations


In quantum mechanics, we are interested not only in vector spaces, but, 
more specifically, in Hilbert spaces, since expectation values are defined in 
terms of an inner product. We wish to consider, then, actions of a group 
that preserve the inner product as well as the linear structure. Although 
the Hilbert spaces in quantum mechanics are generally infinite dimensional, 
we restrict our attention in this section to the finite-dimensional case. 


Definition 16.41 Suppose V is a finite-dimensional Hilbert space over C. 
Denote by U(V) the group of invertible linear transformations of V that pre- 
serve the inner product. A (finite-dimensional) unitary representation 
of a matrix Lie group $G$ is a continuous homomorphism $\Pi : G \to U(V)$,
for some finite-dimensional Hilbert space V. 


Proposition 16.42 Let $\Pi : G \to GL(V)$ be a finite-dimensional
representation of a connected matrix Lie group $G$, and let $\pi$ be the
associated representation of the Lie algebra $\mathfrak{g}$ of $G$. Let
$\langle\cdot,\cdot\rangle$ be an inner product on $V$. Then $\Pi$ is unitary
with respect to $\langle\cdot,\cdot\rangle$ if and only if $\pi(X)$ is
skew-self-adjoint with respect to $\langle\cdot,\cdot\rangle$ for all
$X \in \mathfrak{g}$, that is, if and only if

$$\pi(X)^* = -\pi(X)$$

for all $X \in \mathfrak{g}$.


In a slight abuse of notation, we will refer to a representation $\pi$ of a
Lie algebra $\mathfrak{g}$ on a finite-dimensional inner product space as
unitary if $\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{g}$.

Proof. Suppose first that $\Pi(A)$ is unitary for all $A \in G$. Then for all
$X \in \mathfrak{g}$ and $t \in \mathbb{R}$ we have

$$\Pi(e^{tX})^* = \Pi(e^{tX})^{-1} = \Pi(e^{-tX}) = e^{-t\pi(X)}.$$

On the other hand,

$$\Pi(e^{tX})^* = (e^{t\pi(X)})^* = e^{t\pi(X)^*}.$$

Thus,

$$e^{t\pi(X)^*} = e^{-t\pi(X)}$$

for all $t$. Differentiating at $t = 0$ yields $\pi(X)^* = -\pi(X)$.
In the other direction, if $\pi(X)^* = -\pi(X)$ for all $X \in \mathfrak{g}$,
then

$$\Pi(e^{X})^* = e^{\pi(X)^*} = e^{-\pi(X)} = \Pi(e^{-X}) = \Pi(e^{X})^{-1},$$

meaning that $\Pi(e^{X})$ is unitary. Since $G$ is connected, Corollary 16.28
tells us that each element $A$ of $G$ is expressible as a product of
exponentials, from which it follows that $\Pi(A)$ is unitary.
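
The second half of the argument can be checked numerically. The following is
a minimal sketch, not part of the original text: it exponentiates one
skew-self-adjoint $2 \times 2$ matrix by a truncated power series and measures
how far the result is from unitary. The sample matrix, the truncation length,
and the helper names are assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose (adjoint with respect to the standard inner product)."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

def expm(a, terms=60):
    """Matrix exponential by truncated power series (adequate for small inputs)."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, a)]   # a^m / m!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# A sample skew-self-adjoint matrix: X* = -X.
X = [[1j, 2 + 1j], [-2 + 1j, -3j]]
skew_error = max_diff(dagger(X), [[-x for x in row] for row in X])

t = 0.7
U = expm([[t * x for x in row] for row in X])

# U* U should equal the identity up to rounding error.
unitarity_error = max_diff(matmul(dagger(U), U), identity(2))
print(skew_error, unitarity_error)
```
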


16.7.3 Projective Unitary Representations


In quantum mechanics, two unit vectors in the quantum Hilbert space that 
differ by multiplication by a constant are considered to represent the same 
physical state. Thus, an operator of the form $e^{i\theta}I$, with
$\theta \in \mathbb{R}$, will act as the identity at the level of the physical
states. Suppose that $V$ is a Hilbert space over $\mathbb{C}$, assumed for the
moment to be finite dimensional. Then it is natural to consider homomorphisms
not into $U(V)$ but rather into the quotient group $U(V)/\{e^{i\theta}I\}$.
Of course, given a homomorphism $\Pi$ of $G$ into $U(V)$, we can always turn
$\Pi$ into a homomorphism of $G$ into the quotient group, just by composing
$\Pi$ with the quotient map. Not every homomorphism into the quotient group,
however, arises from a homomorphism into $U(V)$.


Definition 16.43 Suppose V is a finite-dimensional Hilbert space over C. 
Then the projective unitary group over V, denoted PU(V), is the quo- 
tient group 


$$PU(V) = U(V)/\{e^{i\theta}I\},$$

where $\{e^{i\theta}I\}$ denotes the group of matrices of the form
$e^{i\theta}I$, $\theta \in \mathbb{R}$.
Note that $\{e^{i\theta}I\}$ is a closed normal subgroup of $U(V)$. Now,
$U(V)$ is (isomorphic to) a matrix Lie group, since we can identify it with
$U(n)$ by picking an orthonormal basis for $V$. In general, the quotient of a
matrix Lie group by a closed normal subgroup may not be a matrix Lie group. In
this case, however, it is not hard to realize the quotient
$U(n)/\{e^{i\theta}I\}$ as a matrix Lie group.


Proposition 16.44 If $V$ is a finite-dimensional Hilbert space over
$\mathbb{C}$, then $PU(V)$ is isomorphic to a matrix Lie group.

Let $Q : U(V) \to PU(V)$ be the quotient homomorphism and let
$q : \mathfrak{u}(V) \to \mathfrak{pu}(V)$ be the associated Lie algebra
homomorphism. Then $q$ maps $\mathfrak{u}(V)$ onto $\mathfrak{pu}(V)$ and the
kernel of $q$ is the space of matrices of the form $iaI$ with
$a \in \mathbb{R}$. Thus, $\mathfrak{pu}(V)$ is isomorphic to
$\mathfrak{u}(V)/\{iaI\}$.


The Lie algebra $\mathfrak{u}(V)$ of $U(V)$ is the space of skew-self-adjoint
operators on $V$. In Proposition 16.44, the space $\{iaI\}$ is an ideal in
$\mathfrak{u}(V)$ and the quotient is in the sense of Lie algebras over
$\mathbb{R}$; see Exercise 9. If $\dim V = N$, then it is not hard to see that
the Lie algebra $\mathfrak{pu}(V) = \mathfrak{u}(V)/\{iaI\}$ is isomorphic to
the Lie algebra $\mathfrak{su}(N)$. The group $PU(V)$ is not, however,
isomorphic to the group $SU(N)$. See Exercise 16.

Proof. If $\dim V = N$, then $\mathfrak{gl}(V)$, the space of all linear maps
of $V$ to $V$, has dimension $N^2$. Given $U \in U(V)$, we can define

$$C_U : \mathfrak{gl}(V) \to \mathfrak{gl}(V)$$

by

$$C_U(X) = UXU^{-1}.$$

(That is to say, $C_U$ is conjugation by $U$.) Note that
$(C_U)^{-1} = C_{U^{-1}}$ and $C_{UV} = C_U C_V$. Thus, $C$ (i.e., the map
$U \mapsto C_U$) is a homomorphism of $U(V)$ into $GL(\mathfrak{gl}(V))$, and
this homomorphism is clearly continuous. If $U$ is a multiple of the identity,
then $C_U$ is the identity operator on $\mathfrak{gl}(V)$. Conversely, if
$C_U$ is the identity, then $UX = XU$ for all $X \in \mathfrak{gl}(V)$, which
implies (Exercise 18) that $U$ is a multiple of the identity. Thus, the kernel
of $C$ consists precisely of those scalar multiples of the identity that are
in $U(V)$; that is, $\ker C = \{e^{i\theta}I\}$.

We have constructed, then, a homomorphism of $U(V)$ into
$GL(\mathfrak{gl}(V)) \cong GL(N^2;\mathbb{C})$ with a kernel that is precisely
$\{e^{i\theta}I\}$. The image of $U(V)$ under this homomorphism is, therefore,
isomorphic to the quotient group $U(V)/\{e^{i\theta}I\}$. Furthermore, since
$U(V)$ is compact, the image of $U(V)$ under $C$ is compact and thus closed.
This image is, then, a matrix Lie group isomorphic to $PU(V)$.
Let $c$ be the Lie algebra homomorphism associated with the homomorphism $C$.
Using Point 3 of Theorem 16.23, we may calculate that

$$c_X(Y) = \left.\frac{d}{dt}\, e^{tX} Y e^{-tX}\right|_{t=0} = XY - YX = [X,Y].$$

Using Exercise 18 again, we see that $c_X = 0$ if and only if $X$ is a
multiple of the identity. Thus, the kernel of $c$ consists of all the scalar
multiples of $I$ in $\mathfrak{u}(V)$, namely $\{iaI\}$.

Now, the image of $U(V)$ under $C$ is (isomorphic to) $PU(V)$; in particular,
$C$ maps $U(V)$ onto $PU(V)$. It follows that $c$ must map $\mathfrak{u}(V)$
onto $\mathfrak{pu}(V)$. (This claim follows from Theorem 3.15 in [21].) Thus,
$\mathfrak{pu}(V) \cong \mathfrak{u}(V)/\{iaI\}$. ∎
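
The fact that phase multiples of a unitary operator induce the same
conjugation map can be seen in a small computation. The following is a
minimal sketch, not part of the original text; the sample matrices and the
helper name `conj_by` are assumptions chosen for illustration.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose; equals the inverse when the matrix is unitary."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def max_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

def conj_by(u, x):
    """C_U(X) = U X U^{-1}, computed as U X U^* for unitary U."""
    return matmul(matmul(u, x), dagger(u))

alpha, theta = 0.4, 0.9                      # arbitrary sample parameters
U = [[math.cos(alpha), -math.sin(alpha)],
     [math.sin(alpha), math.cos(alpha)]]     # a (real) unitary matrix
phase = cmath.exp(1j * theta)
U_phase = [[phase * x for x in row] for row in U]

X = [[1, 2j], [3, -1 + 1j]]                  # an arbitrary element of gl(V)

# U and e^{i theta} U give the same conjugation map ...
same_map = max_diff(conj_by(U, X), conj_by(U_phase, X)) < 1e-12
# ... while U itself, not being a scalar, does move X.
moves_x = max_diff(conj_by(U, X), X) > 1e-6
print(same_map, moves_x)
```
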


Definition 16.45 A finite-dimensional projective unitary representation
of a matrix Lie group $G$ is a continuous homomorphism $\Pi$ of $G$ into
$PU(V)$, where $V$ is a finite-dimensional Hilbert space over $\mathbb{C}$.
A subspace $W$ of $V$ is said to be invariant under $\Pi$ if for each
$A \in G$, $W$ is invariant under $U$ for every $U \in U(V)$ such that
$[U] = \Pi(A)$. A projective unitary representation $(\Pi, V)$ is irreducible
if the only invariant subspaces are $\{0\}$ and $V$.


Given an ordinary unitary representation, $\Sigma : G \to U(V)$, we can always
form a projective representation, $\Pi : G \to PU(V)$, simply by setting
$\Pi = Q \circ \Sigma$. Not every projective representation, however, arises
in this fashion. Thus, considering projective representations gives us more
flexibility than considering ordinary unitary representations.


Proposition 16.46 Let $\Pi : G \to PU(V)$ be a finite-dimensional projective
unitary representation of a matrix Lie group $G$, and let
$\pi : \mathfrak{g} \to \mathfrak{pu}(V)$ be the associated Lie algebra
homomorphism. Then there exists a Lie algebra homomorphism
$\sigma : \mathfrak{g} \to \mathfrak{u}(V)$ such that $\pi(X) = q(\sigma(X))$
for all $X \in \mathfrak{g}$. It is possible to choose $\sigma$ so that
$\operatorname{trace}(\sigma(X)) = 0$ for all $X \in \mathfrak{g}$, and
$\sigma$ is unique if we require this condition.


That is to say, every finite-dimensional projective representation can be
“de-projectivized” at the Lie algebra level. In general, $\sigma$ is not
unique, because there may be $\sigma$’s for which
$\operatorname{trace}(\sigma(X))$ is nonzero for some $X$. On the other hand,
if $\mathfrak{g}$ has the property that every $X \in \mathfrak{g}$ is a linear
combination of commutators (which is true if $\mathfrak{g} = \mathfrak{so}(3)$),
then $\sigma$ is unique. See Exercise 15.

Proof. Recall that $\mathfrak{pu}(V) = \mathfrak{u}(V)/\{iaI\}$. That is, for
each $X \in \mathfrak{g}$, $\pi(X)$ denotes a whole family of operators that
differ by adding $iaI$. If $Y \in \mathfrak{u}(V)$ is any representative of
$\pi(X)$, then since $Y^* = -Y$, the trace of $Y$ will be pure imaginary.
Thus, there is a unique pure-imaginary constant
$c = -\operatorname{trace}(Y)/\dim V$ such that the trace of $Y + cI$ is zero.
Let us then set $\sigma(X) = Y + cI$. Since $\pi$ is a Lie algebra
homomorphism, $\sigma([X,Y])$ will equal $[\sigma(X),\sigma(Y)] + iaI$, for
some $a \in \mathbb{R}$. Since $\operatorname{trace}(\sigma([X,Y])) = 0$ by
construction and since the commutator of any two matrices has trace zero,
we see that actually $a = 0$. Thus, a $\sigma$ as in the proposition exists,
and it is unique if we require that $\sigma(X)$ have trace zero.
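
The choice of the trace-zero representative is easy to carry out explicitly.
The following is a minimal sketch, not part of the original text: it takes
one skew-self-adjoint matrix $Y$, computes $c = -\operatorname{trace}(Y)/\dim V$,
and checks that $c$ is pure imaginary and that $Y + cI$ has trace zero. It
also checks, on one sample pair, that a commutator has trace zero. The sample
matrices and the helper names are assumptions.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def traceless_representative(y):
    """Return (c, Y + cI) with c = -trace(Y)/dim V, so that trace(Y + cI) = 0."""
    n = len(y)
    c = -trace(y) / n
    shifted = [[y[i][j] + (c if i == j else 0) for j in range(n)]
               for i in range(n)]
    return c, shifted

# A sample skew-self-adjoint matrix (Y* = -Y), so trace(Y) is pure imaginary.
Y = [[2j, 1 + 1j], [-1 + 1j, 5j]]
c, sigma_Y = traceless_representative(Y)

c_is_imaginary = (c.real == 0)
trace_after_shift = trace(sigma_Y)

# The commutator of any two matrices has trace zero; one sample pair.
Z = [[0, 1], [-1, 0]]
commutator_trace = trace(commutator(Y, Z))
print(c, trace_after_shift, commutator_trace)
```
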


Theorem 16.47 Suppose $G$ is a matrix Lie group and $\tilde{G}$ is a universal
cover of $G$, with covering map $\Phi$. Then the following hold.


1. Let $\Pi : G \to PU(V)$ be a finite-dimensional projective unitary
representation of $G$. Then there is an ordinary unitary representation
$\Sigma : \tilde{G} \to U(V)$ of $\tilde{G}$ such that
$\Pi \circ \Phi = Q \circ \Sigma$. Any such $\Sigma$ is irreducible if and
only if $\Pi$ is irreducible. It is possible to choose $\Sigma$ so that
$\det(\Sigma(A)) = 1$ for all $A \in \tilde{G}$, and $\Sigma$ is unique if we
require this condition.


2. Let $\Sigma$ be a finite-dimensional irreducible unitary representation of
$\tilde{G}$. Then the kernel of the associated projective unitary
representation $Q \circ \Sigma$ contains the kernel of the covering map
$\Phi$. Thus, $Q \circ \Sigma$ factors through $G$ and gives rise to a
projective unitary representation of $G$.


In the finite-dimensional case, then, there is a one-to-one correspondence
between irreducible projective unitary representations of $G$ and irreducible,
determinant-one ordinary unitary representations of $\tilde{G}$. Point 1 of
the theorem means that any finite-dimensional projective unitary
representation of the group $G$ can be “de-projectivized” at the expense of
passing to the universal cover $\tilde{G}$ of $G$.

Note that Theorem 16.47 applies only to finite-dimensional projective 
unitary representations. Example 16.56 will provide an infinite-dimensional 
example in which Point 1 of the theorem fails. 

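Proof. If $\mathfrak{g}$ is the Lie algebra of $G$, Proposition 16.46 tells us
that we can find an ordinary representation
$\sigma : \mathfrak{g} \to \mathfrak{u}(V)$ such that $q \circ \sigma = \pi$.
Let $\phi : \tilde{\mathfrak{g}} \to \mathfrak{g}$ be the Lie algebra
homomorphism associated with the covering map $\Phi$, where
$\tilde{\mathfrak{g}}$ is the Lie algebra of $\tilde{G}$. We then define a
representation $\tilde{\sigma} : \tilde{\mathfrak{g}} \to \mathfrak{u}(V)$ by
setting $\tilde{\sigma}(X) = \sigma(\phi(X))$, $X \in \tilde{\mathfrak{g}}$.
Since $\tilde{G}$ is simply connected, we can then find a unique
representation $\Sigma : \tilde{G} \to U(V)$ such that
$\Sigma(e^{X}) = e^{\tilde{\sigma}(X)}$ for all $X \in \tilde{\mathfrak{g}}$.
Since

$$q \circ \tilde{\sigma} = q \circ \sigma \circ \phi = \pi \circ \phi,$$

it follows that $Q \circ \Sigma = \Pi \circ \Phi$. Furthermore, if $\Sigma$
maps into $SU(V)$, then $\sigma = \tilde{\sigma} \circ \phi^{-1}$ maps into
$\mathfrak{su}(V)$. This condition uniquely determines $\sigma$ and thus also
$\tilde{\sigma}$ and $\Sigma$, establishing Point 1 of the theorem.

For Point 2, observe that $\ker\Phi$ is a discrete normal subgroup of
$\tilde{G}$, which is therefore central (Exercises 1 and 12). Thus, for all
$A \in \ker\Phi$, we have

$$\Sigma(A)\Sigma(B) = \Sigma(AB) = \Sigma(BA) = \Sigma(B)\Sigma(A)$$

for all $B \in \tilde{G}$. That is to say, $\Sigma(A)$ is an intertwining map
of $V$ to itself. Since $V$ is irreducible as a representation of $\tilde{G}$,
Schur’s lemma tells us that $\Sigma(A) = cI$, where $|c| = 1$ because
$\Sigma(A) \in U(V)$. Thus, $A$ is in the kernel of the associated projective
representation $Q \circ \Sigma$. ∎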


16.8 New Representations from Old 


In this section, we consider three basic mechanisms for combining repre- 
sentations to produce new representations: direct sums, tensor products, 
and duals. This section assumes familiarity with these notions at the level 
of vector spaces; a brief review is provided in Appendix A.1. 


Definition 16.48 Suppose $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are representations
of a matrix Lie group $G$. The direct sum of these two representations is the
representation $\Pi_1 \oplus \Pi_2 : G \to GL(V_1 \oplus V_2)$ given by

$$(\Pi_1 \oplus \Pi_2)(A) = \Pi_1(A) \oplus \Pi_2(A).$$

The tensor product of $\Pi_1$ and $\Pi_2$ is the representation
$\Pi_1 \otimes \Pi_2 : G \to GL(V_1 \otimes V_2)$ given by

$$(\Pi_1 \otimes \Pi_2)(A) = \Pi_1(A) \otimes \Pi_2(A).$$

Finally, the dual of $\Pi_1$ is the representation
$\Pi_1^{tr} : G \to GL(V_1^*)$ given by

$$\Pi_1^{tr}(A) = \Pi_1(A^{-1})^{tr} = (\Pi_1(A)^{tr})^{-1}.$$

Similarly, the direct sum, tensor product, and dual of Lie algebra
representations can be defined by

$$(\pi_1 \oplus \pi_2)(X) = \pi_1(X) \oplus \pi_2(X)$$
$$(\pi_1 \otimes \pi_2)(X) = \pi_1(X) \otimes I + I \otimes \pi_2(X)$$
$$\pi_1^{tr}(X) = -\pi_1(X)^{tr}.$$


It is important to note the differences in formulas between the group and 
the Lie algebra in the case of tensor products and dual representations. It 
is easy to motivate the definitions for the Lie algebra: If $G$ acts on
$V_1 \otimes V_2$ by $\Pi_1(A) \otimes \Pi_2(A)$, then the associated Lie
algebra action will be given by

$$\left.\frac{d}{dt}\, \Pi_1(e^{tX}) \otimes \Pi_2(e^{tX})\right|_{t=0} = \pi_1(X) \otimes I + I \otimes \pi_2(X).$$

Of course, we continue to use this last formula for tensor products of Lie
algebra representations, even if there is no associated group representation.
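
The difference between the group and Lie algebra formulas can be checked on
matrices. The following is a minimal sketch, not part of the original text:
it takes both representations to be the identity representation on $2 \times 2$
matrices (an assumption made for simplicity) and verifies on one sample pair
that $X \mapsto X \otimes I + I \otimes X$ preserves commutators, whereas the
group-style rule $X \mapsto X \otimes X$ does not. The function names are
hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

def kron(a, b):
    """Kronecker product: the matrix of a (x) b in the product basis."""
    p, q = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(p) for l in range(q)]
            for i in range(p) for k in range(q)]

I2 = [[1, 0], [0, 1]]

def tensor_rep(x):
    """Lie algebra tensor product rule: X (x) I + I (x) X."""
    return add(kron(x, I2), kron(I2, x))

# Integer sample matrices, so that all comparisons below are exact.
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]

lie_rule_ok = (tensor_rep(commutator(X, Y))
               == commutator(tensor_rep(X), tensor_rep(Y)))
naive_rule_ok = (kron(commutator(X, Y), commutator(X, Y))
                 == commutator(kron(X, X), kron(Y, Y)))
print(lie_rule_ok, naive_rule_ok)
```
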
Remark 16.49 If $(\Pi_1, V_1)$ and $(\Pi_2, V_2)$ are representations of a
group $G$, it is possible to view $V_1 \otimes V_2$ as a representation of the
direct product group $G \times G$, by setting

$$(\Pi_1 \otimes \Pi_2)(A, B) = \Pi_1(A) \otimes \Pi_2(B).$$

Similarly, if $(\pi_1, V_1)$ and $(\pi_2, V_2)$ are representations of a Lie
algebra $\mathfrak{g}$, it is possible to view $V_1 \otimes V_2$ as a
representation of $\mathfrak{g} \oplus \mathfrak{g}$ by setting

$$(\pi_1 \otimes \pi_2)(X, Y) = \pi_1(X) \otimes I + I \otimes \pi_2(Y).$$

Nevertheless, it is, in most cases, more natural to view $V_1 \otimes V_2$ as
a representation of $G$ itself, rather than of $G \times G$. Even if $V_1$ and
$V_2$ are irreducible representations of $G$, the space $V_1 \otimes V_2$ will
in most cases fail to be irreducible as a representation of $G$. If, for
example, we take $V_1 = V_2 = V$, then the space of symmetric tensors inside
$V \otimes V$ will form a nontrivial invariant subspace, unless $\dim V = 1$.
An important problem in representation theory is to decompose
$V_1 \otimes V_2$ as a direct sum of irreducible representations, where $V_1$
and $V_2$ are irreducible representations of a fixed group or Lie algebra. In
the case of the Lie algebra $\mathfrak{su}(2)$, this decomposition is
discussed in Sect. 17.9.


Definition 16.50 A finite-dimensional representation of a group or Lie 
algebra is said to be completely reducible if it is isomorphic to a direct 
sum of irreducible representations. 


Proposition 16.51 Every finite-dimensional unitary representation of a 
group or Lie algebra is completely reducible. 


Proof. Suppose $(\Pi, V)$ is a unitary representation of a matrix Lie group
$G$. If $W$ is a subspace of $V$ invariant under each $\Pi(A)$, then
$W^\perp$ is invariant under each $\Pi(A)^*$, as the reader may easily verify.
But since $\Pi$ is unitary,

$$\Pi(A)^* = \Pi(A)^{-1} = \Pi(A^{-1}).$$

Thus, $W^\perp$ is invariant under $\Pi(A^{-1})$ for all $A \in G$, hence
under $\Pi(A)$ for all $A \in G$. We conclude that, in the unitary case, the
orthogonal complement of an invariant subspace is always invariant.

If $V$ is irreducible, there is nothing to prove. If not, we pick a nontrivial
invariant subspace $W$ and decompose $V$ as $W \oplus W^\perp$. The
restriction of $\Pi$ to $W$ or to $W^\perp$ is again a unitary representation,
so we can repeat this procedure for each of these subspaces. Since $V$ is
finite dimensional, the process must eventually terminate, yielding an
orthogonal decomposition of $V$ as a direct sum of irreducible invariant
subspaces.

If we consider a unitary representation $\pi$ of a Lie algebra $\mathfrak{g}$,
we have the same argument, but with the identity $\Pi(A)^* = \Pi(A^{-1})$
replaced by $\pi(X)^* = -\pi(X)$.




Proposition 16.52 Suppose $K$ is a compact matrix Lie group. For any
finite-dimensional representation $(\Pi, V)$ of $K$, there exists an inner
product on $V$ such that $\Pi(A)$ is unitary for all $A \in K$. In particular,
every finite-dimensional representation of $K$ is completely reducible.


See Proposition 4.36 in [21]. 


16.9 Infinite-Dimensional Unitary Representations 


For the applications we have in mind, we need to consider representa- 
tions that are infinite dimensional. The theory of such representations is 
inevitably more complicated than that of finite-dimensional representa- 
tions. For our purposes, it suffices to consider the nicest sort of infinite- 
dimensional representations—unitary representations in a Hilbert space. 


16.9.1 Ordinary Unitary Representations 


We begin by considering ordinary representations and then turn to projec- 
tive representations. 


Definition 16.53 Suppose $G$ is a matrix Lie group. Then a unitary
representation of $G$ is a strongly continuous homomorphism
$\Pi : G \to U(\mathbf{H})$, where $\mathbf{H}$ is a separable Hilbert space
and $U(\mathbf{H})$ is the group of unitary operators on $\mathbf{H}$. Here,
strong continuity of $\Pi$ means that if a sequence $A_m$ in $G$ converges to
$A \in G$, then

$$\lim_{m \to \infty} \|\Pi(A_m)\psi - \Pi(A)\psi\| = 0$$

for all $\psi \in \mathbf{H}$.


We can attempt to associate to a unitary representation $\Pi$ of $G$ some
sort of representation $\pi$ of the Lie algebra $\mathfrak{g}$ of $G$, by
imitating the construction in Theorem 16.23. For any $X \in \mathfrak{g}$,
the map $t \mapsto \Pi(e^{tX})$ is a strongly continuous one-parameter unitary
group. Thus, Stone’s theorem (Theorem 10.15) tells us that there exists a
unique self-adjoint operator $A$ such that $\Pi(e^{tX}) = e^{itA}$ for all
$t \in \mathbb{R}$. If we let $\pi(X)$ denote the skew-self-adjoint operator
$iA$, we will have

$$\Pi(e^{tX}) = e^{t\pi(X)}. \tag{16.5}$$

The operators $\pi(X)$, $X \in \mathfrak{g}$, are in general unbounded and
defined only on a dense subspace of $\mathbf{H}$. Nevertheless, it can be
shown (see, e.g., [43]) that there exists a dense subspace $V$ of $\mathbf{H}$
that is contained in the domain of each $\pi(X)$ and invariant under each
$\pi(X)$, and on which we have $\pi([X,Y]) = [\pi(X), \pi(Y)]$. In the case of
the particular representation that we will consider in the next chapter, we
can avoid these difficulties by looking at finite-dimensional invariant
subspaces.
Proposition 16.54 Suppose $G$ is a matrix Lie group and
$\Pi : G \to U(\mathbf{H})$ is a unitary representation of $G$. For each
$X \in \mathfrak{g}$, let $\pi(X)$ denote the operator in (16.5). Suppose
$V \subset \mathbf{H}$ is a finite-dimensional subspace of $\mathbf{H}$ such
that $\Pi(A)$ maps $V$ into $V$, for all $A \in G$. Then for all
$X \in \mathfrak{g}$, $V \subset \mathrm{Dom}(\pi(X))$, $\pi(X)$ maps $V$ into
$V$, and we have

$$\pi([X,Y])v = [\pi(X), \pi(Y)]v \tag{16.6}$$

for all $v \in V$.

In the other direction, suppose $G$ is connected and suppose $V$ is any
finite-dimensional subspace of $\mathbf{H}$ such that for all
$X \in \mathfrak{g}$, $V \subset \mathrm{Dom}(\pi(X))$ and $\pi(X)$ maps $V$
into $V$. Then $\Pi(A)$ also maps $V$ into $V$, for all $A \in G$.


Proof. Since $V$ is invariant under both $\Pi(A)$ and
$\Pi(A)^* = \Pi(A^{-1})$, the restriction to $V$ of each $\Pi(A)$ is unitary.
The operators $\Pi(A)|_V$ form a finite-dimensional unitary representation of
$G$ that is strongly continuous and thus continuous. (In the
finite-dimensional case, all reasonable notions of continuity for
representations coincide.) For each $X \in \mathfrak{g}$, Theorem 16.18 tells
us that there is an operator $\tilde{X}$ on $V$ such that

$$\Pi(e^{tX})|_V = e^{t\tilde{X}}.$$

Thus, for any $v \in V$, we have

$$\lim_{t \to 0} \frac{\Pi(e^{tX})v - v}{t} = \lim_{t \to 0} \frac{e^{t\tilde{X}}v - v}{t} = \tilde{X}v.$$

This calculation shows that $v$ is in the domain of the infinitesimal
generator $\pi(X)$ of the unitary group $\Pi(e^{tX})$, and that
$\pi(X)v = \tilde{X}v$. Since the operators $\tilde{X}$, $X \in \mathfrak{g}$,
form a representation of $\mathfrak{g}$, we have the relation (16.6).

In the other direction, if $V$ is invariant under $\pi(X)$, the restriction of
$\pi(X)$ to $V$ is automatically bounded. Thus, there is a constant $C$ such
that

$$\|\pi(X)^m v\| \leq C^m \|v\| \tag{16.7}$$

for all $v \in V$. If we use the direct-integral form of the spectral theorem
for the self-adjoint operator $A := -i\pi(X)$, it is easy to see that (16.7)
can only hold if $v$, viewed as an element of the direct integral, is
supported on a bounded interval inside the spectrum of $A$. Since the power
series of the function $\lambda \mapsto e^{it\lambda}$ converges to
$e^{it\lambda}$ uniformly on any finite interval, we will have

$$\Pi(e^{tX})v = e^{itA}v = \sum_{m=0}^{\infty} \frac{t^m \pi(X)^m}{m!}\, v.$$

Each term in the above power series belongs to $V$, which is finite
dimensional and thus closed. We conclude that $\Pi(e^{tX})v$ belongs to $V$
for all $X \in \mathfrak{g}$. Since $G$ is connected, each element of $G$ is a
product of exponentials of Lie algebra elements, and we have the claim. ∎
16.9.2 Projective Unitary Representations


Given a Hilbert space $\mathbf{H}$, let $S^{\mathbf{H}}$ denote the unit
sphere in $\mathbf{H}$, that is, the set of vectors with norm 1. Let
$P\mathbf{H}$ be the quotient space $(S^{\mathbf{H}})/\sim$, where “$\sim$”
denotes the equivalence relation in which $u \sim v$ if and only if
$u = e^{i\theta}v$ for some $\theta \in \mathbb{R}$. The quotient map
$q : S^{\mathbf{H}} \to P\mathbf{H}$ induces a topology on $P\mathbf{H}$ in
which a set $U \subset P\mathbf{H}$ is open if and only if $q^{-1}(U)$ is open
as a subset of the metric space $S^{\mathbf{H}} \subset \mathbf{H}$.


As in the finite-dimensional case, we can form the quotient group

$$PU(\mathbf{H}) := U(\mathbf{H})/\{e^{i\theta}I\}.$$

The action of $U(\mathbf{H})$ on $S^{\mathbf{H}}$ descends to a well-defined
action of $PU(\mathbf{H})$ on $P\mathbf{H}$.


Definition 16.55 A projective unitary representation of a matrix Lie
group $G$ is a homomorphism $\Pi : G \to PU(\mathbf{H})$, for some Hilbert
space $\mathbf{H}$, with the property that if a sequence $A_m$ in $G$
converges to $A$ in $G$, then

$$\Pi(A_m)x \to \Pi(A)x$$

for all $x \in P\mathbf{H}$.


Recall that in the finite-dimensional case, every projective unitary
representation of $G$ can be “de-projectivized” at the expense of possibly
having to pass to the universal cover $\tilde{G}$ of $G$ (Theorem 16.47). The
de-projectivization proceeds by passing to the Lie algebra, choosing the
trace-zero representative of each equivalence class, and then exponentiating
back to the universal cover of the original group. This approach does
not work in the infinite-dimensional case. After all, even assuming we can
construct a Lie algebra homomorphism $\pi(X)$ for each $X \in \mathfrak{g}$,
the representatives of $\pi(X)$ are typically unbounded operators on
$\mathbf{H}$, for which the notion of trace does not make sense. This
difficulty is not just a technicality; the corresponding result in the
infinite-dimensional case is false, as we will now see.


Example 16.56 For all $(a,b) \in \mathbb{R}^2$, define an operator
$T_{(a,b)}$ on $L^2(\mathbb{R})$ by

$$(T_{(a,b)}\psi)(x) = e^{iax}\psi(x - b).$$

Then $T_{(a,b)}$ is unitary for all $(a,b) \in \mathbb{R}^2$ and we have

$$(T_{(a,b)}T_{(a',b')}\psi)(x) = e^{iax}e^{ia'(x-b)}\psi(x - (b + b')) = e^{-ia'b}\,(T_{(a+a',b+b')}\psi)(x). \tag{16.8}$$

The map $(a,b) \mapsto [T_{(a,b)}]$ is a homomorphism of $\mathbb{R}^2$ into
$PU(L^2(\mathbb{R}))$, and this homomorphism is continuous in the sense of
Definition 16.55. There does not, however, exist any homomorphism
$S : \mathbb{R}^2 \to U(L^2(\mathbb{R}))$ such that
$[S_{(a,b)}] = [T_{(a,b)}]$ for all $(a,b) \in \mathbb{R}^2$.
Thus, even though $\mathbb{R}^2$ is simply connected (and thus its own
universal cover), there is no way to de-projectivize the projective unitary
representation $(a,b) \mapsto [T_{(a,b)}]$ of $\mathbb{R}^2$.
Proof. The map $(a,b) \mapsto T_{(a,b)}$ is easily seen to be strongly
continuous, and thus the map $(a,b) \mapsto [T_{(a,b)}]$ is continuous in the
sense of Definition 16.55. If a homomorphism $S$ with the indicated properties
existed, then there would be constants $\theta_{a,b}$ such that
$S_{(a,b)} = e^{i\theta_{a,b}}T_{(a,b)}$. Since $S$ is a homomorphism from the
commutative group $\mathbb{R}^2$ into $U(L^2(\mathbb{R}))$, the operator
$S_{(a,b)}$ would have to commute with $S_{(a',b')}$ for all $(a,b)$ and
$(a',b')$. The operators $T_{(a,b)}$ and $T_{(a',b')}$, being constant
multiples of commuting operators, would then need to commute as well. But this
is not the case; for example, by (16.8), $T_{(a,0)}$ does not commute with
$T_{(0,b')}$ unless $ab'$ is an integer multiple of $2\pi$. ∎
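
Relation (16.8) and the failure of commutativity can be tested pointwise.
The following is a minimal sketch, not part of the original text: operators
are modeled as Python functions acting on functions, and the test function
`psi`, the parameter values, and the sample points are assumptions chosen
for illustration.

```python
import cmath
import math

def T(a, b):
    """Return T_(a,b) as a map on functions: (T psi)(x) = e^{iax} psi(x - b)."""
    def operator(psi):
        def image(x):
            return cmath.exp(1j * a * x) * psi(x - b)
        return image
    return operator

def psi(x):
    """A sample square-integrable test function (arbitrary choice)."""
    return math.exp(-x * x) * (1 + x)

a, b, a2, b2 = 0.7, -1.3, 2.1, 0.4        # (a, b) and (a', b')
points = [-2.0, -0.5, 0.0, 1.25, 3.0]

# Left side of (16.8): T_(a,b) T_(a',b') psi.
lhs = T(a, b)(T(a2, b2)(psi))
# Right side: e^{-i a' b} T_(a+a', b+b') psi.
phase = cmath.exp(-1j * a2 * b)
rhs = T(a + a2, b + b2)(psi)
relation_holds = all(abs(lhs(x) - phase * rhs(x)) < 1e-12 for x in points)

# T_(a,0) and T_(0,b') applied in the two orders differ by a phase,
# so they do not commute for these parameter values.
one_order = T(a, 0.0)(T(0.0, b2)(psi))
other_order = T(0.0, b2)(T(a, 0.0)(psi))
they_commute = all(abs(one_order(x) - other_order(x)) < 1e-12 for x in points)
print(relation_holds, they_commute)
```
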

Despite the negative result in Example 16.56, there is a positive result in 
this direction: If G is connected and “semi-simple,” every projective unitary 
representation of G can be de-projectivized after passing to the universal 
cover. Here, a Lie algebra $\mathfrak{g}$ is said to be simple if
$\mathfrak{g}$ has no nontrivial ideals and $\dim\mathfrak{g} \geq 2$. A Lie
algebra is said to be semi-simple if it is a direct sum of simple algebras.
Finally, a Lie group $G$ is said to be semi-simple if the Lie algebra
$\mathfrak{g}$ of $G$ is semi-simple.

For any connected Lie group $G$, a projective unitary representation $\Pi$ of
$G$ can be de-projectivized by passing to a one-dimensional central extension.
A one-dimensional central extension of $G$ is a Lie group $G'$ together
with a surjective homomorphism $\Phi : G' \to G$ such that the kernel of
$\Phi$ is one-dimensional and contained in the center of $G'$. See the article
[1] of V.
Bargmann for more information about these issues. 


16.10 Exercises 


1. Suppose that $G$ is a connected matrix Lie group and that $N$ is a
discrete normal subgroup of $G$, meaning that there is some neighborhood
$U$ of $I$ in $G$ such that $U \cap N = \{I\}$. Show that $N$ is contained
in the center of $G$.


Hint: Consider the quantity $gng^{-1}$ for $g \in G$ and $n \in N$.


2. (a) Suppose two elements U and V of SU(2) commute. Show that 
each eigenspace for U is invariant under V and vice versa. 


(b) Show that if U is in the center of SU(2), then U = I or U = —I. 


3. Define the Hilbert-Schmidt norm of a matrix $X \in M_n(\mathbb{C})$ by the
formula

$$\|X\|_{HS}^2 = \sum_{j,k=1}^{n} |X_{jk}|^2.$$

Using the Cauchy–Schwarz inequality, show that

$$\|XY\|_{HS} \leq \|X\|_{HS}\,\|Y\|_{HS} \tag{16.9}$$

for all $X, Y \in M_n(\mathbb{C})$.


4. Using term-by-term differentiation of power series, show that for all
$X \in M_n(\mathbb{C})$ and all $1 \leq j,k \leq n$, we have

$$\left.\frac{d}{dt}\,(e^{tX})_{jk}\right|_{t=0} = X_{jk}.$$


5. Verify Property 4 of Theorem 16.15. This should be easy in the case
that $X$ is diagonalizable. In the general case, either use the Jordan
canonical form or appeal to the fact that diagonalizable matrices are
dense in $M_n(\mathbb{C})$.


6. Suppose $X$ and $Y$ are commuting $n \times n$ matrices. Show that

$$e^{X}e^{Y} = e^{X+Y}.$$

This is Property 5 of Theorem 16.15.


Hint: Multiply together the power series for $e^{X}$ and $e^{Y}$ and then
group terms where the total power of $X$ and $Y$ is $n$.


7. For $A \in M_n(\mathbb{C})$, define the logarithm of $A$ by the power series

$$\log A = (A - I) - \frac{(A - I)^2}{2} + \frac{(A - I)^3}{3} - \cdots$$

whenever this series converges. Assume the following result: If $A$ is
sufficiently close to $I$, then $\log A$ is defined and $\exp(\log A) = A$.
[This can be seen easily when $A$ is diagonalizable, and the set of
diagonalizable matrices is dense in $M_n(\mathbb{C})$.]


(a) Show that there exists a constant $C$ such that for all $A$ with
$\|A - I\| < 1/2$ we have

$$\|\log A - (A - I)\| \leq C\,\|A - I\|^2.$$


(b) Show that for all $X, Y \in M_n(\mathbb{C})$ we have

$$\log\left(e^{X/m}e^{Y/m}\right) = \frac{X}{m} + \frac{Y}{m} + O\!\left(\frac{1}{m^2}\right). \tag{16.10}$$

Note that $e^{X/m}e^{Y/m}$ tends to $I$ as $m$ tends to infinity, so that
the left-hand side of (16.10) is defined for all sufficiently large $m$.


(c) Prove the Lie Product Formula.
8. (a) Show that for all $X, Y \in M_n(\mathbb{C})$,

$$\left\|\left.\frac{d}{dt}\,(X + tY)^m\right|_{t=0}\right\| \leq m\,\|X\|^{m-1}\,\|Y\|.$$


(b) Show that the map $X \mapsto e^{X}$ is a continuously differentiable
map of $M_n(\mathbb{C}) \cong \mathbb{R}^{2n^2}$ to itself.

(c) Using Exercise 4, show that the differential of the map $X \mapsto e^{X}$
at $X = 0$ is the identity map of $M_n(\mathbb{C})$ to itself. (Recall that
the differential of a smooth map of $\mathbb{R}^j$ to $\mathbb{R}^k$,
evaluated at a point in $\mathbb{R}^j$, is a linear map of $\mathbb{R}^j$ to
$\mathbb{R}^k$.)


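9. Suppose $\mathfrak{g}$ is a Lie algebra and $\mathfrak{h}$ is an ideal in
$\mathfrak{g}$. Let $\mathfrak{g}/\mathfrak{h}$ denote the vector space
quotient of $\mathfrak{g}$ by $\mathfrak{h}$. Show that the bracket on
$\mathfrak{g}$ descends unambiguously to a bilinear map on
$\mathfrak{g}/\mathfrak{h}$, and that $\mathfrak{g}/\mathfrak{h}$ forms a Lie
algebra under this map.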


10. Suppose that $G_1$, $G_2$, and $G_3$ are matrix Lie groups with Lie
algebras $\mathfrak{g}_1$, $\mathfrak{g}_2$, and $\mathfrak{g}_3$,
respectively. Suppose that $\Phi : G_1 \to G_2$ and $\Psi : G_2 \to G_3$ are
Lie group homomorphisms with associated Lie algebra homomorphisms $\phi$ and
$\psi$, respectively. Show that the Lie algebra homomorphism associated to
$\Psi \circ \Phi : G_1 \to G_3$ is $\psi \circ \phi$.


11. Show that isomorphic matrix Lie groups have isomorphic Lie alge- 
bras. 


12. Suppose $G_1$ and $G_2$ are matrix Lie groups with Lie algebras
$\mathfrak{g}_1$ and $\mathfrak{g}_2$, respectively. Suppose
$\Phi : G_1 \to G_2$ is a Lie group homomorphism with the property that the
associated Lie algebra homomorphism $\phi : \mathfrak{g}_1 \to \mathfrak{g}_2$
is injective. Show that there exists a neighborhood $U$ of the identity in
$G_1$ such that $U \cap \ker\Phi = \{I\}$.


Hint: Use Theorem 16.25.


13. (a) Show that every $R \in SO(3)$ has an eigenvalue of 1.


(b) Show that every $R \in SO(3)$ is conjugate in $SO(3)$ to a matrix of
the form

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$$

for some $\theta \in \mathbb{R}$.
(c) Show that the exponential map from $\mathfrak{so}(3)$ to $SO(3)$ is
surjective.
(d) Show that $SO(3)$ is connected.

14. Show that the center of SO(3) is trivial. 
Hint: Use Part (a) of Exercise 13. 
15. Given a Lie algebra $\mathfrak{g}$, let $[\mathfrak{g},\mathfrak{g}]$
denote the space of linear combinations of commutators, that is, the space
spanned by elements of the form $[X,Y]$ with $X, Y \in \mathfrak{g}$.


(a) Show that $[\mathfrak{g},\mathfrak{g}]$ is an ideal in $\mathfrak{g}$ and
that the quotient $\mathfrak{g}/[\mathfrak{g},\mathfrak{g}]$ is commutative.
(The ideal $[\mathfrak{g},\mathfrak{g}]$ is called the commutator ideal of
$\mathfrak{g}$.)

(b) If $\mathfrak{g} = \mathfrak{so}(3)$, show that
$[\mathfrak{g},\mathfrak{g}] = \mathfrak{g}$.

(c) If $\pi : \mathfrak{g} \to \mathfrak{gl}(V)$ is any finite-dimensional
representation of $\mathfrak{g}$, show that $\pi([\mathfrak{g},\mathfrak{g}])$
is contained in $\mathfrak{sl}(V)$, the space of endomorphisms of $V$ with
trace zero.


16. (a) Show that the Lie algebra $\mathfrak{pu}(n) = \mathfrak{u}(n)/\{iaI\}$
is isomorphic to the Lie algebra $\mathfrak{su}(n)$.

(b) Let $\{e^{2\pi ik/n}I\}$ denote the group of matrices that are of the form
of an $n$th root of unity times the identity. Show that the group $PU(n)$ is
isomorphic to $SU(n)/\{e^{2\pi ik/n}I\}$.


17. Suppose that $G$ is a matrix Lie group with Lie algebra $\mathfrak{g}$ and
that $A$ is an element of $G$. Show that the operation of left multiplication
by $A^{-1}$ is a diffeomorphism of $M_n(\mathbb{C})$. Now show that there
exist neighborhoods $U$ of 0 in $M_n(\mathbb{C})$ and $V$ of $A$ in
$M_n(\mathbb{C})$ such that the map $X \mapsto Ae^{X}$ maps $U$
diffeomorphically onto $V$ and such that for $X \in U$, we have
$X \in \mathfrak{g}$ if and only if $Ae^{X} \in G$. (Use Theorem 16.25.)


18. Suppose that $Z \in M_n(\mathbb{C})$ has the property that $ZX = XZ$ for
all $X \in M_n(\mathbb{C})$. Show that $Z = cI$ for some $c \in \mathbb{C}$.


19. Suppose $(\Pi, \mathbf{H})$ is a unitary representation of a matrix Lie
group $G$, and suppose $V_1$ and $V_2$ are finite-dimensional irreducible
invariant subspaces of $\mathbf{H}$. Show that if $V_1$ and $V_2$ are not
isomorphic as representations of $G$, then $V_1$ is orthogonal to $V_2$ inside
$\mathbf{H}$.


Hint: Show that the orthogonal projection of $\mathbf{H}$ onto $V_1$ or $V_2$
is an intertwining map, and use Schur’s lemma.


17 


Angular Momentum and Spin 


17.1 The Role of Angular Momentum 
in Quantum Mechanics 


Classically, angular momentum may be thought of as the Hamiltonian
generator of rotations (Proposition 2.30). Angular momentum is a particularly
useful concept when a system has rotational symmetry, since in that
case the angular momentum is a conserved quantity (Proposition 2.18).
Quantum mechanically, angular momentum is still the “generator” of rotations,
meaning that it is the infinitesimal generator of a one-parameter
group of unitary rotation operators, in the sense of Stone’s theorem
(Theorem 10.15). The quantum angular momentum is again conserved in systems
with rotational symmetry. This means that if the Hamiltonian $\hat{H}$ is
invariant under rotations, then $\hat{H}$ commutes with the angular momentum
operators, in which case the angular momentum operators are constants
of motion in the quantum mechanical sense.

The various components of the classical angular momentum vector for
a particle in $\mathbb{R}^3$ satisfy certain simple commutation relations
under the Poisson bracket (Exercise 19 in Chap. 2). We will see that those
relations are the commutation relations for the Lie algebra $\mathfrak{so}(3)$
of the rotation group $SO(3)$. If $\hat{H}$ commutes with each component of
the angular momentum, each eigenspace for $\hat{H}$ (the solution space to
$\hat{H}\psi = \lambda\psi$ for a given $\lambda$) is invariant under the
angular momentum operators. Thus, the eigenspace constitutes a representation
of the Lie algebra $\mathfrak{so}(3)$. By classifying the irreducible
(finite-dimensional) representations of $\mathfrak{so}(3)$, we can obtain a
lot of information about the structure of the solution spaces to the equation
$\hat{H}\psi = \lambda\psi$, in the case that $\hat{H}$ is invariant under
rotations. Specifically, the representation theory of $\mathfrak{so}(3)$
allows us to determine completely the angular dependence of a solution
$\psi(x)$, leaving only the radial dependence of $\psi$ to be determined. This
has the effect of reducing the number of independent variables from three to
one (just the radius $r$ in polar coordinates), thereby reducing the problem
to solving an ordinary differential equation.
Understanding angular momentum from the point of view of representa- 
tions of a Lie algebra also prepares us to understand the concept of spin. 
The Hilbert space for a particle in R® with spin is the tensor product 
of L?(R°) with a finite-dimensional vector space V, where V carries an 
irreducible action of the rotation group SO(3). In this setting, the proper 
notion of “action” is a projective representation of SO(3), meaning a family 
of operators satisfying the relations of SO(3) up to phase factors (constants 
of absolute value one). These phase factors are permitted because, physi- 
cally, two vectors that differ only by a constant represent the same physical 
state. By Proposition 16.46, every projective representation of SO(3) can 
be de-projectivized at the level of the Lie algebra so(3). Conversely, every 
irreducible ordinary representation of the Lie algebra so(3) gives rise to a 
representation of the universal cover SU(2) of SO(3), which in turn gives 
rise (Theorem 16.47) to a projective representation of SO(3). Thus, the 
possibilities for the space V are in one-to-one correspondence with the
irreducible representations of the Lie algebra so(3). In the case of “half-integer
spin,” the space V does not carry an ordinary representation of the group
SO(3).


17.2 The Angular Momentum Operators in $\mathbb{R}^3$


Recall from Sect. 2.4 that the classical angular momentum for a particle in
$\mathbb{R}^3$ is given by $J = x \times p$, so that, say, $J_3 = x_1 p_2 - x_2 p_1$. As in Sect. 3.10,
we introduce the quantum mechanical counterpart, a “vector” $\hat{J}$ with
components that are operators,

$$\hat{J} = X \times P.$$

Thus, for example, $\hat{J}_1 = X_2 P_3 - X_3 P_2$. Note that each component of the
angular momentum involves products of distinct components of the
position and momentum operators $X$ and $P$, which commute. Thus, in the
expression for, say, $\hat{J}_1$, it does not matter whether we write $X_2 P_3$ or $P_3 X_2$.

The angular momentum operators are unbounded operators and are
defined only on a dense subspace of $L^2(\mathbb{R}^3)$. For the moment, we will not
specify the domain of these operators, leaving that until the next section.
We will see, however, that the domain of each angular momentum operator
contains the Schwartz space $\mathcal{S}(\mathbb{R}^3)$ (Definition A.15).

As in Exercise 10 in Chap. 3, we can use the canonical commutation
relations to obtain $[\hat{J}_1, \hat{J}_2] = i\hbar\hat{J}_3$. We may similarly compute $[\hat{J}_2, \hat{J}_3]$ and
$[\hat{J}_3, \hat{J}_1]$ to obtain the complete set of commutation relations among the $\hat{J}$'s:

$$\frac{1}{i\hbar}[\hat{J}_1, \hat{J}_2] = \hat{J}_3; \qquad \frac{1}{i\hbar}[\hat{J}_2, \hat{J}_3] = \hat{J}_1; \qquad \frac{1}{i\hbar}[\hat{J}_3, \hat{J}_1] = \hat{J}_2.$$

These relations compare well with the Poisson bracket relations among the
various components of the classical angular momentum vector (Exercise 19
in Chap. 2).

Writing out $\hat{J}_3$ explicitly, we have

$$(\hat{J}_3\psi)(x) = -i\hbar\left(x_1\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_1}\right)\psi(x) \tag{17.1}$$

$$\phantom{(\hat{J}_3\psi)(x)} = -i\hbar\left.\frac{d}{d\theta}\psi(R_\theta x)\right|_{\theta=0}, \tag{17.2}$$





where $R_\theta$ denotes a counterclockwise rotation by angle $\theta$ in the $(x_1, x_2)$-plane,
with similar expressions for $\hat{J}_1$ and $\hat{J}_2$. This description of the angular
momentum operators demonstrates that they, like the components of
the classical angular momentum, are closely connected to rotations (recall
Propositions 2.18 and 2.30). The connection between angular momentum
and rotations will be made more explicit in the following sections by
recognizing that they make up the Lie algebra action associated with the natural
action of the rotation group on $L^2(\mathbb{R}^3)$.
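
The commutation relations among the components of the angular momentum can be checked mechanically on polynomials, which lie in the domain of the differential expressions even though they are not square-integrable. The following is a minimal sketch, not part of the original text: the dictionary representation of polynomials, the helper names (`add`, `times_x`, `d_dx`, `J`, `commutator`, `scaled`), and the choice `HBAR = 1` are all illustrative assumptions.

```python
# Polynomials in (x1, x2, x3) are dicts {(a, b, c): coeff}, where the key
# stands for the monomial x1**a * x2**b * x3**c.
HBAR = 1  # illustrative choice of units, not taken from the text

def add(p, q, scale=1):
    """Return p + scale*q with zero coefficients dropped."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + scale * c
    return {m: c for m, c in out.items() if c != 0}

def times_x(p, i):
    """Multiply p by the coordinate with 0-based index i."""
    out = {}
    for mono, c in p.items():
        e = list(mono)
        e[i] += 1
        out[tuple(e)] = c
    return out

def d_dx(p, i):
    """Partial derivative of p in the coordinate with 0-based index i."""
    out = {}
    for mono, c in p.items():
        if mono[i] > 0:
            e = list(mono)
            e[i] -= 1
            out[tuple(e)] = c * mono[i]
    return out

def J(j, p):
    """Apply J_{j+1} = -i*hbar*(x_a d/dx_b - x_b d/dx_a), with (j, a, b) cyclic."""
    a, b = (j + 1) % 3, (j + 2) % 3
    inner = add(times_x(d_dx(p, b), a), times_x(d_dx(p, a), b), scale=-1)
    return {m: -1j * HBAR * c for m, c in inner.items()}

def commutator(a, b, p):
    """Return [J_{a+1}, J_{b+1}] applied to p."""
    return add(J(a, J(b, p)), J(b, J(a, p)), scale=-1)

def scaled(p, s):
    """Return s*p."""
    return {m: s * c for m, c in p.items()}

# Sample polynomial x1^2 x2 + 3 x2 x3^3 + x1 x2 x3 (an arbitrary test case).
p = {(2, 1, 0): 1, (0, 1, 3): 3, (1, 1, 1): 1}
# Compare [J_a, J_b] p with i*hbar*J_c p for the three cyclic index triples.
checks = [commutator(a, b, p) == scaled(J(c, p), 1j * HBAR)
          for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]]
```

All coefficients that occur are Gaussian integers, so the floating-point comparisons are exact for this example.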

We may define a new version of the angular momentum operators $\tilde{J}_j$,
given by

$$\tilde{J}_j = \frac{1}{\hbar}\hat{J}_j. \tag{17.3}$$

Since Planck’s constant and angular momentum have the same units, the
$\tilde{J}_j$’s do not depend on the choice of units; we refer to them as the
dimensionless versions of the angular momentum operators.


17.3 Angular Momentum from the Lie Algebra Point of View


We begin this section by looking at the natural action of the rotation group
SO(3) on $L^2(\mathbb{R}^3)$.

Definition 17.1 For each $R \in \mathrm{SO}(3)$, define $\Pi(R) : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$ by

$$(\Pi(R)\psi)(x) = \psi(R^{-1}x). \tag{17.4}$$


Proposition 17.2 For each $R \in \mathrm{SO}(3)$, the map $\Pi(R) : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$
is unitary. Furthermore, the map $\Pi : \mathrm{SO}(3) \to U(L^2(\mathbb{R}^3))$ is a strongly
continuous homomorphism.

Proof. Since the Lebesgue measure on $\mathbb{R}^3$ is invariant under rotations,
$\Pi(R)$ is unitary for all $R \in \mathrm{SO}(3)$. It is easily checked that $\Pi(R_1R_2) =
\Pi(R_1)\Pi(R_2)$; for this to be true, we need to have $\psi(R^{-1}x)$ rather than
$\psi(Rx)$ in the definition of $\Pi(R)$. Arguing as in the proof of Example 10.12,
we can easily verify that $\Pi$ is strongly continuous. ∎

Recall the computation of the Lie algebra so(3) of SO(3) in Sect. 16.5,
and the basis $\{F_1, F_2, F_3\}$ for so(3) in (16.2) in that section.


Proposition 17.3 For each $X \in \mathrm{so}(3)$, let $\pi(X)$ denote the skew-self-adjoint
operator such that

$$\Pi(e^{tX}) = e^{t\pi(X)}. \tag{17.5}$$

Then the domain of each $\pi(F_j)$ contains the Schwartz space $\mathcal{S}(\mathbb{R}^3)$ and on
$\mathcal{S}(\mathbb{R}^3)$ we have the relation

$$\hat{J}_j = i\hbar\pi(F_j).$$

In the notation of Stone’s theorem (Theorem 10.15), the operator $\pi(X)$
in (17.5) is $i$ times the infinitesimal generator of the one-parameter unitary
group $t \mapsto \Pi(e^{tX})$.

Proof. In the case of $\hat{J}_3$, we compute as in Example 16.16 that $e^{\theta F_3}$ is a
counterclockwise rotation by angle $\theta$ in the $(x_1, x_2)$-plane. If $\psi$ belongs to
$\mathcal{S}(\mathbb{R}^3)$ then the limit defining the derivative in (17.2) is easily seen to hold
in the $L^2$ sense. Thus, recalling the inverse on the right-hand side of (17.4), we see
that $\hat{J}_3$ coincides with $i\hbar\pi(F_3)$, as claimed. Similar calculations apply to
$\hat{J}_1$ and $\hat{J}_2$. ∎

Although it is not easy to determine the precise domain of each angular
momentum operator, we can see from Proposition 16.54 that if $\psi$ belongs
to a finite-dimensional subspace of $L^2(\mathbb{R}^3)$ that is invariant under rotations,
then $\psi$ belongs to the domain of each $\hat{J}_j$.


17.4 The Irreducible Representations of so(3) 


In this section, we classify the irreducible finite-dimensional representations 
of the Lie algebra so(3), up to isomorphism. (See Sect. 16.7 for the defini- 
tions and elementary properties of representations.) All representations are 
taken over the field of complex numbers and assumed to have dimension 
at least one. We continue to use the basis $\{F_1, F_2, F_3\}$ for so(3) in (16.2).


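Theorem 17.4 Let $\pi : \mathrm{so}(3) \to \mathrm{gl}(V)$ be a finite-dimensional irreducible
representation of so(3). Define operators $L^+$, $L^-$, and $L_3$ on V by

$$
\begin{aligned}
L^+ &= i\pi(F_1) - \pi(F_2) \\
L^- &= i\pi(F_1) + \pi(F_2) \\
L_3 &= i\pi(F_3).
\end{aligned}
$$

Let $l = \frac{1}{2}(\dim V - 1)$, so that $\dim V = 2l + 1$. Then there exists a basis
$v_0, v_1, \ldots, v_{2l}$ of V such that

$$
\begin{aligned}
L_3 v_j &= (l - j)v_j \\
L^- v_j &= \begin{cases} v_{j+1} & \text{if } j < 2l \\ 0 & \text{if } j = 2l \end{cases} \\
L^+ v_j &= \begin{cases} j(2l + 1 - j)v_{j-1} & \text{if } j > 0 \\ 0 & \text{if } j = 0. \end{cases}
\end{aligned} \tag{17.6}
$$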

Thus, the quantity $l$ completely determines the structure of an irreducible
representation of so(3). Since dim V is a positive integer, $l$ has to have one
of the following values:

$$l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots. \tag{17.7}$$

The proof of Theorem 17.4 is given later in this section.
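
The formulas in (17.6) can be turned directly into matrices, and the commutation relations that the operators $L_3$, $L^+$, and $L^-$ must satisfy can then be checked by machine. The following is a minimal sketch under stated assumptions: the function names (`spin_matrices`, `matmul`, `comm`, `scale`) are illustrative, and the spin is passed as the integer $2l$ so that half-integer values stay exact.

```python
from fractions import Fraction

def spin_matrices(two_l):
    """Matrices of L3, L+, L- in the basis v_0, ..., v_{2l} of (17.6).

    The argument is 2l (an integer). Column j of each matrix holds the
    coordinates of the image of v_j.
    """
    n = two_l + 1
    l = Fraction(two_l, 2)
    L3 = [[Fraction(0)] * n for _ in range(n)]
    Lp = [[Fraction(0)] * n for _ in range(n)]
    Lm = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        L3[j][j] = l - j                                   # L3 v_j = (l - j) v_j
        if j < two_l:
            Lm[j + 1][j] = Fraction(1)                     # L- v_j = v_{j+1}
        if j > 0:
            Lp[j - 1][j] = Fraction(j * (two_l + 1 - j))   # L+ v_j = j(2l+1-j) v_{j-1}
    return L3, Lp, Lm

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(A, B):
    """Commutator AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

def scale(A, c):
    """Scalar multiple c*A."""
    return [[c * x for x in row] for row in A]
```

For example, `spin_matrices(1)` gives the two-dimensional (spin one-half) case and `spin_matrices(2)` the three-dimensional (spin one) case; in each case `comm(L3, Lp)` reproduces `Lp`, `comm(L3, Lm)` reproduces `scale(Lm, -1)`, and `comm(Lp, Lm)` reproduces `scale(L3, 2)`.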


Definition 17.5 If $(\pi, V)$ is an irreducible finite-dimensional representation
of so(3), then the spin $l$ of $(\pi, V)$ is the largest eigenvalue of the operator
$L_3 := i\pi(F_3)$. Equivalently, $l$ is the unique number such that $\dim V = 2l + 1$.


Our next result says that all the values of $l$ in (17.7) actually arise as
spins of irreducible representations of so(3).

Theorem 17.6 For any $l = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$ there exists an irreducible
representation of so(3) of dimension $2l + 1$, and any two irreducible
representations of so(3) of dimension $2l + 1$ are isomorphic.


Note that the theorem is only asserting the existence, for each $l$, of a
representation of the Lie algebra so(3). As we will see in the next section,
an irreducible representation $\pi$ of so(3) comes from a representation $\Pi$ of
SO(3) if and only if $l$ is an integer. Nevertheless, the representations of
so(3) with half-integer values of $l$ (the ones where $l$ is half of an integer
but not an integer) still play an important role in quantum physics, as
discussed in Sect. 17.8. (Although it would be clearer to refer to the case
$l = 1/2, 3/2, 5/2, \ldots$ as “integer plus a half,” the terminology “half-integer”
is firmly established.)

By comparison to Proposition 17.3, we may think of $L_3$ as the analog
of the third component of the dimensionless angular momentum operator
on the space V. Indeed, we will eventually be interested in applying
Theorem 17.4 to the case in which V is a subspace of $L^2(\mathbb{R}^3)$ that is invariant
under the action of SO(3). In that case, $L_3$ will be precisely (the restriction
to V of) the dimensionless angular momentum operator $\tilde{J}_3$.

Observe that Theorem 17.4 bears a strong similarity to our analysis of
the quantum harmonic oscillator. In both cases, we have a “chain” of
eigenvectors for a certain operator, along with raising and lowering operators
that raise and lower the eigenvalue of that operator. In the case of the
harmonic oscillator, we have a chain that begins with a ground state and
then extends infinitely in one direction. In the case of so(3) representations,
we have a chain that is finite in both directions. The chain begins with an
eigenvector $v_0$ for $L_3$ with maximal eigenvalue, so that $v_0$ is annihilated
by the raising operator $L^+$. A key step in the proof of Theorem 17.4 is to
determine how the chain can terminate (in the direction of lower
eigenvalues for $L_3$) without violating the commutation relations among $L_3$, $L^+$,
and $L^-$.

Proof of Theorem 17.4. Since $\pi$ is a Lie algebra homomorphism, the
$\pi(F_j)$’s satisfy the same commutation relations as the $F_j$’s themselves.
From this we can easily verify the following relations among the operators
$L^+$, $L^-$, and $L_3$:

$$[L_3, L^+] = L^+ \tag{17.8}$$
$$[L_3, L^-] = -L^- \tag{17.9}$$
$$[L^+, L^-] = 2L_3. \tag{17.10}$$


Now, since we are working over the algebraically closed field $\mathbb{C}$, the operator
$L_3$ has at least one eigenvector $v$ with eigenvalue $\lambda$. Consider, then, $L^+v$.
Using (17.8), we compute that

$$L_3L^+v = (L^+L_3 + L^+)v = L^+(\lambda v) + L^+v = (\lambda + 1)L^+v. \tag{17.11}$$

Thus, either $L^+v = 0$ or $L^+v$ is an eigenvector for $L_3$ with eigenvalue
$\lambda + 1$. We call $L^+$ the “raising operator,” since it has the effect of raising
the eigenvalue of $L_3$ by 1.

If we apply $L^+$ repeatedly to $v$, we obtain eigenvectors for $L_3$ with
eigenvalues increasing by 1 at each step, as long as we do not get the zero vector.
Eventually, though, we must get 0, since the operator $L_3$ has only finitely
many eigenvalues. Thus, there exists $k \geq 0$ such that $(L^+)^k v \neq 0$ but
$(L^+)^{k+1}v = 0$. By applying (17.11) repeatedly, we see that $(L^+)^k v$ is an
eigenvector for $L_3$ with eigenvalue $\lambda + k$.

Let us now introduce the notation $v_0 := (L^+)^k v$ and $\mu = \lambda + k$. Then $v_0$
is a nonzero vector with $L^+v_0 = 0$ and $L_3v_0 = \mu v_0$. We now forget about
the original vector $v$ and eigenvalue $\lambda$ and consider only $v_0$ and $\mu$. Define
vectors $v_j$ by

$$v_j = (L^-)^j v_0, \qquad j = 0, 1, 2, \ldots.$$

Arguing as in (17.11), but using (17.9) in place of (17.8), we see that $L^-$
has the effect of either lowering the eigenvalue of $L_3$ by 1 or of giving the
zero vector. Thus, $L_3v_j = (\mu - j)v_j$.

Next, we claim that for $j \geq 1$ we have

$$L^+v_j = j(2\mu + 1 - j)v_{j-1}, \qquad j = 1, 2, 3, \ldots, \tag{17.12}$$

which is easily proved by induction on $j$, using (17.10) (Exercise 2). Since,
again, $L_3$ has only finitely many eigenvalues, $v_j$ must eventually be zero.
Thus, there exists some $N \geq 0$ such that $v_N \neq 0$ but $v_{N+1} = 0$. Since
$v_{N+1} = 0$, applying (17.12) with $j = N + 1$ gives

$$0 = L^+v_{N+1} = (N + 1)(2\mu - N)v_N.$$

Since $v_N \neq 0$ and $N + 1 > 0$, we must have $(2\mu - N) = 0$. This means that
$\mu$ must equal $N/2$.

Letting $l = N/2$ and putting $\mu = N/2 = l$, we have the formulas recorded
in (17.6). Meanwhile, since the $v_j$’s are eigenvectors for $L_3$ with distinct
eigenvalues, the $v_j$’s are automatically linearly independent. Furthermore,
the span of the $v_j$’s is invariant under $L^+$, $L^-$, and $L_3$, hence under all of
so(3). Since V is assumed to be irreducible, the span of the $v_j$’s must be
all of V. Thus, the $v_j$’s form a basis for V. The dimension of V is therefore
equal to the number of $v_j$’s, which is $N + 1 = 2l + 1$. ∎

Proof of Theorem 17.6. We construct V simply by defining a space
V with basis $v_0, v_1, \ldots, v_{2l}$ and defining the action of so(3) by (17.6). It
is a simple matter (Exercise 4) to check that $L^+$, $L^-$, and $L_3$, defined in
this way, have the correct commutation relations, so that V is indeed a
representation of so(3).

It remains to show that V is irreducible. Suppose that W is an invariant
subspace of V and that $W \neq \{0\}$. We need to show that $W = V$. To
this end, suppose that $w$ is some nonzero element of W, which we can
decompose as $w = \sum_{j=0}^{2l} a_j v_j$. Let $j_0$ be the largest index for which $a_j$ is
nonzero. According to the formula for $L^+$ in (17.6), applying $L^+$ to any
of the vectors $v_1, \ldots, v_{2l}$ gives a nonzero multiple of the previous element
in our chain. Thus, $(L^+)^{j_0}w$ will be a nonzero multiple of $v_0$. Since W
is invariant, this means that $v_0$ belongs to W. But then by applying $L^-$
repeatedly, we see that $v_j$ belongs to W for each $j$, so that $W = V$.

Theorem 17.4 tells us that any irreducible representation of so(3) of
dimension $2l + 1$ has a basis as in (17.6). We can then construct an
isomorphism between any two irreducible representations by mapping this basis
in one space to the corresponding basis in the other space. ∎

In the rest of this section, we look at some additional properties of rep- 
resentations of so(3). 





Proposition 17.7 Let $\pi : \mathrm{so}(3) \to \mathrm{gl}(V)$ be an irreducible representation
of so(3). Then there exists an inner product on V, unique up to multiplication
by a constant, such that $\pi(X)$ is skew-self-adjoint for all $X \in \mathrm{so}(3)$.


Proof. Recalling how the operators $L_3$, $L^+$, and $L^-$ are defined, we can
see that the assertion that each $\pi(X)$, $X \in \mathrm{so}(3)$, is skew-self-adjoint is
equivalent to the assertion that $L_3$ is self-adjoint and that $L^+$ and $L^-$
are adjoints of each other. Since the $v_j$’s are eigenvectors for $L_3$ with
distinct eigenvalues, if $L_3$ is to be self-adjoint, the $v_j$’s must be orthogonal.
Conversely, if we have any inner product for which the $v_j$’s are orthogonal,
then $L_3$ will be self-adjoint, as is easily verified.

It remains to investigate the consequences of the condition $(L^+)^* = L^-$.
Assuming this condition, we compute that

$$\langle v_j, v_j \rangle = \langle L^-v_{j-1}, L^-v_{j-1} \rangle = \langle v_{j-1}, L^+L^-v_{j-1} \rangle.$$

But $L^+L^- = L^-L^+ + 2L_3$. Furthermore, $L_3v_{j-1} = (l - j + 1)v_{j-1}$ and
$L^+v_{j-1} = (j - 1)(2l - j + 2)v_{j-2}$ and, thus,

$$
\begin{aligned}
\langle v_j, v_j \rangle &= \langle v_{j-1}, L^+L^-v_{j-1} \rangle \\
&= (j - 1)(2l - j + 2)\langle v_{j-1}, L^-v_{j-2} \rangle + 2(l - j + 1)\langle v_{j-1}, v_{j-1} \rangle.
\end{aligned}
$$

Recalling that $L^-v_{j-2} = v_{j-1}$ and simplifying gives

$$\langle v_j, v_j \rangle = j(2l - j + 1)\langle v_{j-1}, v_{j-1} \rangle. \tag{17.13}$$

It is easy to see that if the $v_j$’s are orthogonal, then $L^+$ and $L^-$ are adjoints
of each other if and only if the normalization condition (17.13) holds for
$j = 1, 2, \ldots, 2l$. Since $j(2l - j + 1)$ is positive for each such $j$, there is no
obstruction to normalizing the $v_j$’s so that this condition holds, and so an
inner product with the desired property exists. Since the only freedom of
choice in defining the inner product is the normalization of $v_0$, the inner
product is unique up to multiplication by a constant. ∎
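
The normalization condition (17.13) is a simple recursion, and it can be run explicitly to see the squared norms it forces on the chain. The sketch below is illustrative only: the function names are hypothetical, the normalization $\langle v_0, v_0 \rangle = 1$ is an arbitrary choice, and the spin is passed as the integer $2l$.

```python
from fractions import Fraction

def norms_squared(two_l):
    """Return [<v_0,v_0>, ..., <v_2l,v_2l>] from (17.13), taking <v_0,v_0> = 1."""
    norms = [Fraction(1)]
    for j in range(1, two_l + 1):
        # <v_j, v_j> = j (2l - j + 1) <v_{j-1}, v_{j-1}>
        norms.append(j * (two_l - j + 1) * norms[-1])
    return norms

def adjoint_defect(two_l):
    """Largest |<L- v_j, v_{j+1}> - <v_j, L+ v_{j+1}>| along the chain.

    A value of zero means L+ and L- are adjoints of each other on the
    orthogonal basis v_0, ..., v_{2l} with the norms computed above.
    """
    n = norms_squared(two_l)
    worst = Fraction(0)
    for j in range(two_l):
        lhs = n[j + 1]                       # L- v_j = v_{j+1}
        rhs = (j + 1) * (two_l - j) * n[j]   # L+ v_{j+1} = (j+1)(2l-j) v_j
        worst = max(worst, abs(lhs - rhs))
    return worst
```

For $l = 1$ the recursion gives squared norms 1, 2, 4, all positive, which is the point of the existence argument: each factor $j(2l - j + 1)$ is positive for $1 \leq j \leq 2l$.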


Proposition 17.8 Suppose $(\pi, V)$ is an irreducible representation of so(3)
of dimension $2l + 1$. Define the Casimir operator $C_\pi \in \mathrm{End}(V)$ by the
formula

$$C_\pi = \pi(F_1)^2 + \pi(F_2)^2 + \pi(F_3)^2.$$

Then for all $v \in V$, we have

$$C_\pi v = -l(l + 1)v.$$

Proof. See Exercise 3. ∎
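
The value of the Casimir operator can be confirmed on the explicit matrices coming from (17.6). From the definitions of $L^+$, $L^-$, and $L_3$ one finds $\pi(F_1) = -\frac{i}{2}(L^+ + L^-)$, $\pi(F_2) = \frac{1}{2}(L^- - L^+)$, and $\pi(F_3) = -iL_3$, so that $C_\pi = -\left[L_3^2 + \frac{1}{2}(L^+L^- + L^-L^+)\right]$. The following sketch uses that expression; the function names are illustrative and the spin is passed as the integer $2l$.

```python
from fractions import Fraction

def ladder_matrices(two_l):
    """Matrices of L3, L+, L- from (17.6); entry [i][j] is the v_i-coefficient
    of the image of v_j."""
    n = two_l + 1
    l = Fraction(two_l, 2)
    L3 = [[l - j if i == j else Fraction(0) for j in range(n)] for i in range(n)]
    Lm = [[Fraction(1) if i == j + 1 else Fraction(0) for j in range(n)]
          for i in range(n)]
    Lp = [[Fraction(j * (two_l + 1 - j)) if i == j - 1 else Fraction(0)
           for j in range(n)] for i in range(n)]
    return L3, Lp, Lm

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def casimir(two_l):
    """C = -(L3^2 + (L+ L- + L- L+)/2), computed exactly."""
    L3, Lp, Lm = ladder_matrices(two_l)
    n = two_l + 1
    pm, mp, sq = matmul(Lp, Lm), matmul(Lm, Lp), matmul(L3, L3)
    return [[-(sq[i][j] + (pm[i][j] + mp[i][j]) / 2) for j in range(n)]
            for i in range(n)]
```

For spin one-half, `casimir(1)` is $-\frac{3}{4}$ times the identity, in agreement with $-l(l+1)$ at $l = \frac{1}{2}$.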

If we look at the proof of Theorem 17.4, we see that the only place in
which irreducibility was used is in showing that the span of $v_0, v_1, \ldots, v_{2l}$
is equal to V. We can therefore obtain the following result, which will be
used in Sect. 17.9.


Proposition 17.9 Let $(\pi, V)$ be any finite-dimensional representation of
so(3), not necessarily irreducible. Suppose $v_0$ is a nonzero element of V such
that $L^+v_0 = 0$ and $L_3v_0 = \lambda v_0$ for some $\lambda \in \mathbb{C}$. Then $\lambda$ is equal to a
non-negative integer or half-integer $l$. Furthermore, the vectors $v_0, v_1, \ldots, v_{2l}$
defined by

$$v_j = (L^-)^j v_0, \qquad j = 0, 1, \ldots, 2l,$$

span an irreducible invariant subspace of V of dimension $2l + 1$, and $L^+$,
$L^-$, and $L_3$ act on these vectors according to the formulas in Theorem 17.4.

In general, given a finite-dimensional representation $(\pi, V)$ of a Lie
algebra and a nonzero vector $v_0 \in V$, we say that $v_0$ is a cyclic vector
for V if the smallest invariant subspace of V containing $v_0$ is all
of V. In Proposition 17.9, the vector $v_0$ is certainly a cyclic vector for
$W := \mathrm{span}(v_0, \ldots, v_{2l})$. It should be noted, however, that a representation’s
having a cyclic vector does not, in general, mean that the representation
is irreducible (Exercise 5). Thus, the irreducibility of W is not a consequence
of some general result about cyclic vectors, but holds only because of the
assumed special properties of the vector $v_0$.


17.5 The Irreducible Representations of SO(3)


Having classified the irreducible representations of the Lie algebra so(3),
we now turn to the classification of the representations of the group SO(3).
Since SO(3) is connected (Exercise 13 in Chap. 16), Proposition 16.39 tells
us that a representation of SO(3) is irreducible if and only if the associated
Lie algebra representation is irreducible, and that two representations of
SO(3) are isomorphic if and only if the associated Lie algebra
representations are isomorphic. Thus, to classify the irreducible representations of
SO(3) up to isomorphism, we merely have to determine which irreducible
representations of the Lie algebra so(3) come from a representation of the
group SO(3).


Proposition 17.10 Let $\pi_l : \mathrm{so}(3) \to \mathrm{gl}(V)$ be an irreducible representation
of so(3), with spin $l := \frac{1}{2}(\dim V - 1)$. If $l$ is an integer (i.e., if the dimension
of V is odd), then there exists a representation $\Pi_l : \mathrm{SO}(3) \to \mathrm{GL}(V)$ such
that $\Pi_l$ and $\pi_l$ are related as in Theorem 16.23. If $l$ is a half-integer (i.e.,
if the dimension of V is even) then no such representation $\Pi_l$ exists.


It follows from this result and Proposition 16.39 that the irreducible
representations of the group SO(3) are precisely the $\Pi_l$’s for which $l$ is an
integer.

Proof. If $l$ is a half-integer, then $L_3$ is diagonal in the basis $\{v_j\}$, with
eigenvalues being half-integers. Thus,

$$e^{2\pi\pi_l(F_3)} = e^{-2\pi iL_3} = -I.$$

(Here the “$\pi$” in front of $\pi_l$ is the number $\pi = 3.14\ldots$.) On the other hand,
by a simple modification of Example 16.16, we can see that the matrix
$F_3 \in \mathrm{so}(3)$ satisfies $e^{2\pi F_3} = I$. Thus, if a corresponding representation $\Pi_l$
of SO(3) existed, we would have

$$\Pi_l(I) = \Pi_l(e^{2\pi F_3}) = e^{2\pi\pi_l(F_3)} = -I,$$

which is a contradiction.

If $l$ is an integer, we make use of the isomorphism $\phi$ between su(2)
and so(3) described in the proof of Example 16.32, which maps the
basis $\{E_1, E_2, E_3\}$ of su(2) to the basis $\{F_1, F_2, F_3\}$ of so(3). We obtain a
representation $\pi_l'$ of su(2) by setting $\pi_l'(X) = \pi_l(\phi(X))$. Since SU(2) is
simply connected, Theorem 16.30 tells us that there is a representation $\Pi_l'$ of
SU(2) related to $\pi_l'$ in the usual way. We then compute that

$$\Pi_l'(-I) = \Pi_l'(e^{2\pi E_3}) = e^{2\pi\pi_l'(E_3)} = e^{2\pi\pi_l(F_3)} = e^{-2\pi iL_3} = I,$$

since the eigenvalues of $L_3$ are integers.

Now, by Example 16.34, there is a surjective homomorphism $\Phi$ from
SU(2) onto SO(3) for which the associated Lie algebra homomorphism is $\phi$,
and $\ker\Phi = \{I, -I\}$. Since the kernel of $\Pi_l'$ contains $\{I, -I\}$, the map $\Pi_l'$
factors through SO(3), giving a representation $\Pi_l$ of SO(3) such that $\Pi_l' =
\Pi_l \circ \Phi$. By Exercise 10 in Chap. 16, the associated Lie algebra representation
$\sigma_l$ of so(3) satisfies $\pi_l' = \sigma_l \circ \phi$, so that $\sigma_l = \pi_l' \circ \phi^{-1} = \pi_l$. Thus, $\Pi_l$ is the
desired representation of SO(3). ∎
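
The dichotomy between integer and half-integer spin comes down to the diagonal operator $e^{-2\pi iL_3}$, whose entries are $e^{-2\pi i(l - j)}$ in the basis of Theorem 17.4. The following numerical sketch evaluates those entries; the function names and the tolerance `tol` are illustrative assumptions, and the spin is passed as the integer $2l$.

```python
import cmath

def rotation_by_2pi(two_l):
    """Diagonal entries of exp(-2*pi*i*L3) in the basis v_0, ..., v_{2l}."""
    l = two_l / 2
    return [cmath.exp(-2j * cmath.pi * (l - j)) for j in range(two_l + 1)]

def is_identity_at_2pi(two_l, tol=1e-9):
    """True when exp(2*pi*pi_l(F3)) equals the identity up to rounding.

    This is the condition used in the proof: it fails for half-integer l
    (the operator is -I) and holds for integer l.
    """
    return all(abs(z - 1) < tol for z in rotation_by_2pi(two_l))
```

Calling `is_identity_at_2pi(two_l)` for `two_l = 0, 2, 4` returns `True`, while for `two_l = 1, 3, 5` every diagonal entry is numerically $-1$ and the function returns `False`.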


17.6 Realizing the Representations Inside $L^2(S^2)$


In this section, we deviate from the traditional treatment in the physics
literature by thinking of the “spherical harmonics” as restrictions to the unit
sphere of certain polynomials on $\mathbb{R}^3$, rather than describing the spherical
harmonics in angular coordinates on the sphere. Our approach avoids some
messy computations in polar coordinates and it also generalizes readily to
higher dimensions.

Recall from Sect. 17.3 that there is a natural unitary representation
$\Pi : \mathrm{SO}(3) \to U(L^2(\mathbb{R}^3))$ given by $(\Pi(R)\psi)(x) = \psi(R^{-1}x)$. In solving rotationally
invariant problems such as the quantum hydrogen atom, it will be useful
to understand the structure of finite-dimensional subspaces V of $L^2(\mathbb{R}^3)$
such that V is invariant under $\Pi$ and such that the restriction of $\Pi$ to V is
irreducible.

If we write functions on $\mathbb{R}^3$ in polar coordinates, then SO(3) acts only on
the angle variables. Thus, it is useful to consider also the action of SO(3)
on $L^2(S^2)$, given by the same formula as for $L^2(\mathbb{R}^3)$, namely

$$(\Pi(R)\psi)(x) = \psi(R^{-1}x), \qquad x \in S^2.$$

In computing the norm for $L^2(S^2)$, we use the surface area measure on
$S^2$, which is invariant under the action of SO(3). Once we have found
invariant subspaces inside $L^2(S^2)$, it is a simple matter to produce invariant
subspaces inside $L^2(\mathbb{R}^3)$ as well, as we will see in the next section.




We will be interested in this section in harmonic polynomials on $\mathbb{R}^3$, that
is, polynomials $p$ satisfying $\Delta p = 0$, where $\Delta$ is the Laplacian. Since we
always consider representations over $\mathbb{C}$, we allow these polynomials to have
complex coefficients.


Definition 17.11 Let $l$ be a non-negative integer. Define a subspace $V_l$ of
$L^2(S^2)$ by setting $V_l$ equal to the space of restrictions to $S^2$ of harmonic
polynomials on $\mathbb{R}^3$ that are homogeneous of degree $l$. Then $V_l$ is called the
space of spherical harmonics of degree $l$.


Note that if $p$ is a homogeneous polynomial on $\mathbb{R}^3$ of some degree $l$, then
the restriction of $p$ to $S^2$ is identically zero only if $p$ itself is identically zero.
After all, if $p$ is homogeneous of degree $l$ and zero on $S^2$, then

$$p(x) = |x|^l\, p\!\left(\frac{x}{|x|}\right) = 0$$

for all $x \neq 0$, and hence, by continuity, for all $x \in \mathbb{R}^3$. (By contrast, the
nonzero, nonhomogeneous polynomial $p(x) := x_1^2 + x_2^2 + x_3^2 - 1$ is identically
zero on $S^2$.) We are therefore free to shift back and forth between thinking
of the elements of $V_l$ as functions on $S^2$ or as functions on $\mathbb{R}^3$.

It is well known that the Laplacian $\Delta$ commutes with rotations. It follows
that each $V_l$ is invariant under the action of the rotation group. We will
eventually see that $V_l$ is irreducible under this action.

Every homogeneous polynomial of degree 0 or 1 is harmonic. Thus, $V_0$
consists of the constant functions on $S^2$ and $V_1$ is spanned by the
restrictions to $S^2$ of the functions $x_1$, $x_2$, and $x_3$. Meanwhile, the space of
homogeneous polynomials of degree 2 is 6-dimensional, and the space of harmonic
polynomials that are homogeneous of degree 2 is spanned by the following
five polynomials: $x_1x_2$, $x_2x_3$, $x_3x_1$, $x_1^2 - x_2^2$, and $x_2^2 - x_3^2$. (The polynomial
$x_1^2 - x_3^2$ is also harmonic, but it is just the sum of $x_1^2 - x_2^2$ and $x_2^2 - x_3^2$.)
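
The degree-2 example can be verified by applying the Laplacian term by term. In the sketch below, which is an illustration rather than part of the original text, a polynomial is stored as a dictionary mapping exponent triples $(a, b, c)$ to coefficients; the helper names `laplacian` and `add` are hypothetical.

```python
def laplacian(p):
    """Laplacian of a polynomial stored as {(a, b, c): coeff}."""
    out = {}
    for mono, coeff in p.items():
        for i, e in enumerate(mono):
            if e >= 2:
                m = list(mono)
                m[i] -= 2
                out[tuple(m)] = out.get(tuple(m), 0) + coeff * e * (e - 1)
    return {m: v for m, v in out.items() if v != 0}

def add(p, q):
    """Sum of two polynomials with zero coefficients dropped."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

# The five spanning polynomials listed in the text.
degree_two_basis = [
    {(1, 1, 0): 1},                  # x1 x2
    {(0, 1, 1): 1},                  # x2 x3
    {(1, 0, 1): 1},                  # x3 x1
    {(2, 0, 0): 1, (0, 2, 0): -1},   # x1^2 - x2^2
    {(0, 2, 0): 1, (0, 0, 2): -1},   # x2^2 - x3^2
]
all_harmonic = all(laplacian(p) == {} for p in degree_two_basis)
```

By contrast, `laplacian({(2, 0, 0): 1})` returns the constant 2, so $x_1^2$ by itself is not harmonic.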


Theorem 17.12 The spaces $V_l$ have the following properties.

1. Each $V_l$ has dimension $2l + 1$.

2. Each $V_l$ is invariant under the action of the rotation group and
irreducible under this action.

3. For $l \neq m$, the spaces $V_l$ and $V_m$ are orthogonal in $L^2(S^2)$.

4. The Hilbert space $L^2(S^2)$ decomposes as the orthogonal direct sum of
the $V_l$’s, as $l$ ranges over the non-negative integers.


The remainder of this section will be devoted to the proof of
Theorem 17.12. We proceed in a series of lemmas, along with some
corollaries of those lemmas.

Lemma 17.13 Let $P$ denote the space of polynomials on $\mathbb{R}^3$ with complex
coefficients. There exists an inner product $\langle\cdot,\cdot\rangle_P$ on $P$ with the property that

$$\langle p, \Delta q \rangle_P = \langle x^2 p, q \rangle_P,$$

where

$$x^2 = x_1^2 + x_2^2 + x_3^2.$$

Proof. Although it is possible to give a combinatorial construction of the
desired inner product, we can also give an analytic construction. Every
polynomial $p$ on $\mathbb{R}^3$ certainly has a holomorphic extension to $\mathbb{C}^3$, denoted
$p_{\mathbb{C}}$. We may define, then,

$$\langle p, q \rangle_P = \int_{\mathbb{C}^3} \overline{p_{\mathbb{C}}(z)}\, q_{\mathbb{C}}(z)\, \frac{e^{-|z|^2}}{\pi^3}\, dz,$$

which is nothing but the inner product of $p_{\mathbb{C}}$ and $q_{\mathbb{C}}$ as elements of the
Segal-Bargmann space $\mathcal{H}L^2(\mathbb{C}^3, \mu_1)$. According to Lemma 14.12, we have

$$\int_{\mathbb{C}^3} \overline{p_{\mathbb{C}}(z)}\, \frac{\partial q_{\mathbb{C}}}{\partial z_j}\, \frac{e^{-|z|^2}}{\pi^3}\, dz = \int_{\mathbb{C}^3} \overline{z_j\, p_{\mathbb{C}}(z)}\, q_{\mathbb{C}}(z)\, \frac{e^{-|z|^2}}{\pi^3}\, dz$$

for all $p, q \in P$ and all $j = 1, 2, 3$. This relation means that

$$\left\langle p, \frac{\partial q}{\partial x_j} \right\rangle_P = \langle x_j\, p, q \rangle_P,$$

from which we readily obtain the desired property of our inner product. ∎
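
The combinatorial construction mentioned in the proof can be made concrete. The sketch below assumes the standard fact, not stated in the text, that with respect to the Gaussian weight $e^{-|z|^2}/\pi^3$ the monomials are orthogonal with $\langle x_1^a x_2^b x_3^c, x_1^a x_2^b x_3^c \rangle_P = a!\,b!\,c!$. Under that assumption it checks the adjoint relation between the Laplacian and multiplication by $x^2$ on a sample pair of polynomials; the helper names are illustrative.

```python
from math import factorial

def laplacian(p):
    """Laplacian of a polynomial stored as {(a, b, c): coeff}."""
    out = {}
    for mono, coeff in p.items():
        for i, e in enumerate(mono):
            if e >= 2:
                m = list(mono)
                m[i] -= 2
                out[tuple(m)] = out.get(tuple(m), 0) + coeff * e * (e - 1)
    return {m: v for m, v in out.items() if v != 0}

def times_x_squared(p):
    """Multiply p by x1^2 + x2^2 + x3^2."""
    out = {}
    for mono, coeff in p.items():
        for i in range(3):
            m = list(mono)
            m[i] += 2
            out[tuple(m)] = out.get(tuple(m), 0) + coeff
    return {m: v for m, v in out.items() if v != 0}

def inner(p, q):
    """<p, q>_P, conjugate-linear in p, with <x^a, x^a> = a1! a2! a3!."""
    total = 0
    for mono, c in p.items():
        if mono in q:
            a, b, d = mono
            weight = factorial(a) * factorial(b) * factorial(d)
            total += complex(c).conjugate() * q[mono] * weight
    return total

# Arbitrary sample polynomials of degrees 1 and 3 with complex coefficients.
p = {(1, 0, 0): 2 + 1j, (0, 1, 0): 1}
q = {(3, 0, 0): 1, (1, 2, 0): 1j, (0, 1, 2): 5, (0, 3, 0): -2}
lhs = inner(p, laplacian(q))          # <p, Laplacian q>
rhs = inner(times_x_squared(p), q)    # <x^2 p, q>
```

Both sides evaluate to the same complex number for this pair, as the lemma requires.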

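A standard bit of elementary combinatorics shows that the number of
ordered triples $(l_1, l_2, l_3)$ of non-negative integers with $l_1 + l_2 + l_3 = l$ is
equal to $(l + 2)(l + 1)/2$. Since the monomials $x_1^{l_1} x_2^{l_2} x_3^{l_3}$ with
$l_1 + l_2 + l_3 = l$ form a basis for the space $P_l$ of polynomials that are
homogeneous of degree $l$, we have $\dim P_l = (l + 2)(l + 1)/2$.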


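Corollary 17.14 If $P_l$ denotes the space of polynomials on $\mathbb{R}^3$ that are
homogeneous of degree $l$, then the Laplacian $\Delta$ maps $P_l$ onto $P_{l-2}$ for all
$l \geq 2$. Thus, for all $l \geq 2$, we have

$$
\begin{aligned}
\dim V_l &= \dim P_l - \dim P_{l-2} \\
&= \frac{(l + 2)(l + 1)}{2} - \frac{l(l - 1)}{2} \\
&= 2l + 1.
\end{aligned}
$$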


Proof. Let us equip the finite-dimensional spaces $P_l$ and $P_{l-2}$ with the
inner product from Lemma 17.13. It is easy to see that the statement,
“The orthogonal complement of the image is the kernel of the adjoint,”
applies to linear maps of one finite-dimensional inner product space to
another. Applying this to $\Delta : P_l \to P_{l-2}$, we note that the adjoint of $\Delta$ is
multiplication by $x^2$, which is clearly injective, since $x_1^2 + x_2^2 + x_3^2$ is zero
only at the origin. Thus, the orthogonal complement of the image of $\Delta$ is
$\{0\}$. Since the spaces are finite-dimensional, this means that $\Delta$ maps $P_l$
onto $P_{l-2}$. ∎
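
The surjectivity of the Laplacian and the resulting dimension count can be confirmed for small degrees by linear algebra. The following sketch writes the Laplacian from $P_l$ to $P_{l-2}$ as a matrix in the monomial bases and computes its rank exactly over the rationals; the function names are illustrative, and the check is a finite experiment rather than a proof.

```python
from fractions import Fraction

def monomials(l):
    """Exponent triples (a, b, c) of non-negative integers with a + b + c = l."""
    return [(a, b, l - a - b) for a in range(l + 1) for b in range(l + 1 - a)]

def laplacian_matrix(l):
    """Matrix of the Laplacian from P_l to P_{l-2} in the monomial bases (l >= 2)."""
    rows = {m: i for i, m in enumerate(monomials(l - 2))}
    cols = monomials(l)
    M = [[Fraction(0)] * len(cols) for _ in rows]
    for j, mono in enumerate(cols):
        for i, e in enumerate(mono):
            if e >= 2:
                t = list(mono)
                t[i] -= 2
                M[rows[tuple(t)]][j] += e * (e - 1)
    return M

def rank(M):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    r = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            if f != 0:
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def harmonic_dimension(l):
    """Dimension of the kernel of the Laplacian on P_l, for l >= 2."""
    return len(monomials(l)) - rank(laplacian_matrix(l))
```

For each degree tried, the rank equals $\dim P_{l-2}$ (so the map is onto) and `harmonic_dimension(l)` equals $2l + 1$.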


Corollary 17.15 Let $l$ be a non-negative integer and let $k = l/2$ if $l$ is
even and let $k = (l - 1)/2$ if $l$ is odd. Then each $p \in P_l$ can be decomposed
in the form

$$p(x) = p_0(x) + |x|^2 p_1(x) + |x|^4 p_2(x) + \cdots + |x|^{2k} p_k(x),$$

where each $p_j(x)$ is a harmonic polynomial that is homogeneous of degree
$l - 2j$. In particular, the restriction of $p$ to $S^2$ satisfies

$$p|_{S^2} = (p_0 + p_1 + \cdots + p_k)|_{S^2},$$

where $p_0 + p_1 + \cdots + p_k$ is a (nonhomogeneous) harmonic polynomial.


Given any polynomial $p$, not necessarily homogeneous, we can apply
Corollary 17.15 to each homogeneous piece of $p$. We see, then, that given
any polynomial $p$, there exists a harmonic polynomial $\tilde{p}$ such that $p$ and $\tilde{p}$
have the same restriction to $S^2$.

Proof. We proceed by induction on $l$. If $l = 0$ or $l = 1$, then all $p \in P_l$
are harmonic and the desired decomposition is simply $p = p_0$. Consider,
then, some $l \geq 2$ and assume the result holds for all degrees less than $l$.
Lemma 17.13 tells us that $P_l$ decomposes as an orthogonal direct sum of
the kernel of $\Delta$ and the image of $P_{l-2}$ under multiplication by $|x|^2$. Thus,
any $p \in P_l$ can be decomposed as $p = p_0 + |x|^2 q_0$, where $p_0$ is harmonic
and $q_0$ belongs to $P_{l-2}$. By induction, $q_0$ has a decomposition of the desired
form; substituting this in for $q_0$ in the decomposition $p = p_0 + |x|^2 q_0$ gives
the desired decomposition of $p$. ∎
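
In degree 2 the decomposition can be written down in closed form, since the Laplacian of a quadratic is a constant and the Laplacian of $|x|^2$ is 6: one takes $p_1 = \Delta p / 6$ and $p_0 = p - |x|^2 p_1$. The sketch below implements only this special case as an illustration; the helper names are hypothetical and no claim is made that this is how the general induction should be carried out.

```python
from fractions import Fraction

def add(p, q, scale=1):
    """Return p + scale*q with zero coefficients dropped."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + scale * c
    return {m: c for m, c in out.items() if c != 0}

def laplacian(p):
    """Laplacian of a polynomial stored as {(a, b, c): coeff}."""
    out = {}
    for mono, coeff in p.items():
        for i, e in enumerate(mono):
            if e >= 2:
                m = list(mono)
                m[i] -= 2
                out[tuple(m)] = out.get(tuple(m), 0) + coeff * e * (e - 1)
    return {m: v for m, v in out.items() if v != 0}

def times_r2(p):
    """Multiply p by |x|^2 = x1^2 + x2^2 + x3^2."""
    out = {}
    for mono, coeff in p.items():
        for i in range(3):
            m = list(mono)
            m[i] += 2
            out[tuple(m)] = out.get(tuple(m), 0) + coeff
    return {m: v for m, v in out.items() if v != 0}

def decompose_degree2(p):
    """Split a homogeneous quadratic p as p0 + |x|^2 * p1 with p0 harmonic."""
    lap = laplacian(p)                                  # a constant polynomial
    p1 = {m: Fraction(c) / 6 for m, c in lap.items()}   # Laplacian of |x|^2 is 6
    p0 = add(p, times_r2(p1), scale=-1)
    return p0, p1

# Example: p = x1^2 splits as (2 x1^2 - x2^2 - x3^2)/3 + |x|^2 * (1/3).
p0, p1 = decompose_degree2({(2, 0, 0): 1})
```

On the unit sphere this says that $x_1^2$ agrees with the nonhomogeneous harmonic polynomial $\frac{1}{3}(2x_1^2 - x_2^2 - x_3^2) + \frac{1}{3}$.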

To show that $V_l$ is irreducible under the action $\Pi$ of SO(3), we pass to
the Lie algebra. Since, as we have remarked, restriction to the sphere is
injective on homogeneous polynomials, we may think of the elements of $V_l$
as polynomials on $\mathbb{R}^3$, in which case, the Lie algebra action $\pi$ associated
with $\Pi$ is given in terms of the usual angular momentum operators.


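Lemma 17.16 As in Theorem 17.4, let $L_3 = i\pi(F_3) = \tilde{J}_3$ and let $L^+ =
i\pi(F_1) - \pi(F_2) = \tilde{J}_1 + i\tilde{J}_2$. For any non-negative integer $l$, the polynomial
$p(x_1, x_2, x_3) := (x_1 + ix_2)^l$ belongs to $V_l$ and satisfies

$$L_3p = lp$$

and

$$L^+p = 0.$$

Proof. Since it is independent of $x_3$ and holomorphic as a function of
$z := x_1 + ix_2$, the polynomial $p$ is automatically harmonic, which can also
be verified by direct calculation. Meanwhile, applying $L_3$ to $p$ gives

$$
\begin{aligned}
L_3p &= -i\left(x_1\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_1}\right)(x_1 + ix_2)^l \\
&= -i\left[x_1 l(x_1 + ix_2)^{l-1}(i) - x_2 l(x_1 + ix_2)^{l-1}\right] \\
&= l(x_1 + ix_2)^l.
\end{aligned}
$$

Finally, applying $L^+ := i\pi(F_1) - \pi(F_2)$ to $p$ gives

$$
\begin{aligned}
L^+p &= \left[-i\left(x_2\frac{\partial}{\partial x_3} - x_3\frac{\partial}{\partial x_2}\right) + \left(x_3\frac{\partial}{\partial x_1} - x_1\frac{\partial}{\partial x_3}\right)\right](x_1 + ix_2)^l \\
&= -i\left(-x_3 l(x_1 + ix_2)^{l-1}(i)\right) + x_3 l(x_1 + ix_2)^{l-1}(1) \\
&= 0,
\end{aligned}
$$

as claimed. ∎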
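
The two identities in Lemma 17.16, and the length of the chain obtained by applying the lowering operator repeatedly, can be checked on explicit polynomials. The sketch below is illustrative: it assumes the dimensionless operators act by $-i(x_a\,\partial/\partial x_b - x_b\,\partial/\partial x_a)$ on polynomials, takes $L^- = \tilde{J}_1 - i\tilde{J}_2$, and uses hypothetical helper names throughout.

```python
from math import comb

I_POWERS = [1, 1j, -1, -1j]   # i**k for k mod 4, kept exact

def add(p, q, scale=1):
    """Return p + scale*q with zero coefficients dropped."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + scale * c
    return {m: c for m, c in out.items() if c != 0}

def times_x(p, i):
    """Multiply p by the coordinate with 0-based index i."""
    out = {}
    for mono, c in p.items():
        e = list(mono)
        e[i] += 1
        out[tuple(e)] = c
    return out

def d_dx(p, i):
    """Partial derivative of p in the coordinate with 0-based index i."""
    out = {}
    for mono, c in p.items():
        if mono[i] > 0:
            e = list(mono)
            e[i] -= 1
            out[tuple(e)] = c * mono[i]
    return out

def J_tilde(j, p):
    """Dimensionless J_{j+1} = -i (x_a d/dx_b - x_b d/dx_a), (j, a, b) cyclic."""
    a, b = (j + 1) % 3, (j + 2) % 3
    inner = add(times_x(d_dx(p, b), a), times_x(d_dx(p, a), b), scale=-1)
    return {m: -1j * c for m, c in inner.items()}

def L3(p):
    return J_tilde(2, p)

def L_plus(p):
    return add(J_tilde(0, p), J_tilde(1, p), scale=1j)

def L_minus(p):
    return add(J_tilde(0, p), J_tilde(1, p), scale=-1j)

def highest_weight(l):
    """Coefficients of (x1 + i*x2)**l from the binomial theorem."""
    return {(l - k, k, 0): comb(l, k) * I_POWERS[k % 4] for k in range(l + 1)}

def chain_length(l):
    """Number of nonzero vectors p, L- p, (L-)^2 p, ... before reaching zero."""
    p, count = highest_weight(l), 0
    while p:
        p, count = L_minus(p), count + 1
    return count
```

All coefficients are Gaussian integers, so the comparisons are exact; for small $l$ the chain has exactly $2l + 1$ nonzero members.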

Corollary 17.17 The space $V_l$ is irreducible under the action of SO(3).


Proof. By Proposition 17.9, if we apply $L^-$ repeatedly to the polynomial
$p$, we obtain a “chain” of eigenvectors of length $2l + 1$. These eigenvectors
span an irreducible invariant subspace of dimension $2l + 1$. Since we have
already established that $\dim V_l = 2l + 1$, the elements of the chain must
span $V_l$, which implies that $V_l$ is irreducible. ∎

We have now assembled all the pieces necessary for a proof of the main
result of this section.

Proof of Theorem 17.12. We have already proved Points 1 and 2 of the
theorem in Corollaries 17.14 and 17.17, respectively. Now, each $V_l$ is an
irreducible representation of SO(3), and no two of the $V_l$’s can be
isomorphic, because they all have different dimensions. Thus, by Exercise 19 in
Chap. 16, $V_l$ and $V_m$ must be orthogonal inside $L^2(S^2)$ for $l \neq m$, which is
Point 3.

Finally, by the Stone-Weierstrass theorem and the density results of
Theorem A.10, the restrictions to $S^2$ of polynomials on $\mathbb{R}^3$ form a dense
subspace of $L^2(S^2)$. But Corollary 17.15 shows that the space of
restrictions to $S^2$ of polynomials coincides with the space of restrictions to $S^2$
of harmonic polynomials. Thus, the span of the $V_l$’s is dense in $L^2(S^2)$,
establishing Point 4. ∎


17.7 Realizing the Representations Inside $L^2(\mathbb{R}^3)$


Recall that for homogeneous polynomials on $\mathbb{R}^3$, the restriction map from
$\mathbb{R}^3$ to $S^2$ is injective. Thus, we may think of the space $V_l$ equally well as
a space of functions on $S^2$ (as in the previous section) or as a space of
functions on $\mathbb{R}^3$. In this section, then, we will let $V_l$ denote the space of
harmonic polynomials on $\mathbb{R}^3$ that are homogeneous of degree $l$.


Definition 17.18 Suppose $l$ is a non-negative integer and $f$ is a
measurable function on $(0, \infty)$ such that

$$\int_0^\infty |f(r)|^2\, r^{2l+2}\, dr < \infty. \tag{17.14}$$

Let $V_{l,f} \subset L^2(\mathbb{R}^3)$ denote the space of functions $\psi$ of the form

$$\psi(x) = p(x) f(|x|), \tag{17.15}$$

where $p \in V_l$.


The condition on $f(r)$ is precisely what one needs to make $\psi(x)$ a
square-integrable function on $\mathbb{R}^3$ (compute the $L^2$ norm in spherical coordinates).

Definition 17.18 is not the one that physicists typically use. In the physics
literature, one sees functions of the form

$$\psi(x) = Y_{lm}(\theta, \phi)\, g(r), \tag{17.16}$$

where $r$, $\theta$, and $\phi$ are the usual spherical coordinates. Here $Y_{lm}$ is the
restriction to the sphere of a particular harmonic polynomial that is
homogeneous of degree $l$, written in spherical coordinates. (Up to a normalization
factor, the $Y_{lm}$’s are obtained by using the basis for $V_l$ in Theorem 17.4.)
Thus, if we move along a ray from the origin in $\mathbb{R}^3$, only the value of $g(r)$
changes. By contrast, in (17.15), as we move along a ray, the $p(x)$ factor
contributes a factor of $r^l$. We can write the physics expression in rectangular
coordinates as

$$\psi(x) = Y_{lm}\!\left(\frac{x}{|x|}\right) g(|x|) = Y_{lm}(x)\, \frac{g(|x|)}{|x|^l}. \tag{17.17}$$

For computational purposes, the expression (17.15) is more convenient
than (17.17); in fact, in the analysis of the hydrogen atom, physicists
multiply by $r^l$ at some later point in the calculation, just so that the relevant
differential equation will take on a simpler form.


Proposition 17.19 Every space of the form $V_{l,f} \subset L^2(\mathbb{R}^3)$ is invariant
and irreducible under the action of SO(3). Conversely, every
finite-dimensional, irreducible, SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$ is of the form
$V_{l,f}$ for some non-negative integer $l$ and some $f$ satisfying (17.14).

Proof. Since the factor $f(|x|)$ is invariant under rotations, the action of
SO(3) only affects the function $p$. Thus, $V_{l,f}$ is isomorphic, as a
representation of SO(3), to the space $V_l$, which is irreducible by Theorem 17.12.

For the other direction, the Lebesgue measure on $\mathbb{R}^3$ decomposes as a
product of the surface area measure on $S^2$ with the measure $4\pi r^2\, dr$ on
$(0, \infty)$. Thus, by a standard measure-theoretic result (Proposition 19.12),
$L^2(\mathbb{R}^3)$ decomposes canonically as the Hilbert tensor product of $L^2(S^2)$
and $L^2((0, \infty))$, where a vector of the form $f \otimes g$ in the tensor product
corresponds to the function $f(\theta, \phi)g(r)$ in $L^2(\mathbb{R}^3)$, as in (17.16). Since $L^2(S^2)$
decomposes (Theorem 17.12) as the sum of the spaces $V_l$, $l = 0, 1, 2, \ldots$,
we can decompose $L^2(\mathbb{R}^3)$ as a sum of spaces of the form

$$V_{l,k} = V_l \otimes g_k,$$

where the $g_k$’s form an orthonormal basis for $L^2((0, \infty))$.

Now, let V be any finite-dimensional, irreducible, SO(3)-invariant
subspace of $L^2(\mathbb{R}^3)$. Let $\pi_{l,k} : L^2(\mathbb{R}^3) \to V_{l,k}$ be the orthogonal
projection operator, and let $p_{l,k}$ be the restriction of $\pi_{l,k}$ to V. This map is easily
seen to be an intertwining map for the action of SO(3). Thus, since both V
and $V_{l,k}$ are irreducible, Schur’s lemma tells us that each $p_{l,k}$ is either zero
or an isomorphism. Furthermore, since the spaces $V_{l,k}$ are nonisomorphic
for different values of $l$, we cannot have both $p_{l,k}$ and $p_{l',k'}$ being nonzero
for $l \neq l'$. On the other hand, $p_{l,k}$ cannot be zero for all $k$ and $l$, since the
$V_{l,k}$’s span $L^2(\mathbb{R}^3)$. Thus, there must be some value $l_0$ of $l$ such that $p_{l_0,k_0}$
is nonzero for some $k_0$ but such that $p_{l,k} = 0$ for all $l \neq l_0$.

Applying Schur's lemma again, we see that $\rho_{l_0,k}(\rho_{l_0,k_0})^{-1}$ must be of the
form $c_k I$ for each $k$. Given any $\psi \in V$, let $v$ be the unique element of $V_{l_0}$
such that $\rho_{l_0,k_0}(\psi) = v \otimes g_{k_0}$. Then we have

$$\rho_{l_0,k}(\psi) = c_k\,(v \otimes g_k)$$

for every $k$. Since also $\rho_{l,k}(\psi) = 0$ for $l \neq l_0$, we conclude that $\psi$ must be
of the form $v \otimes g$, where

$$g = \sum_k c_k g_k.$$

Since this holds for each $\psi \in V$ (with the same set of constants $c_k$), we see
that $V = V_{l_0} \otimes g$, which is nothing but the form in (17.16). Then $V$ is of
the form claimed in the proposition, where $f(r) = g(r)/r^l$. ∎

It can further be shown that each closed, SO(3)-invariant subspace of
$L^2(\mathbb{R}^3)$ decomposes as an orthogonal direct sum of finite-dimensional,
irreducible, SO(3)-invariant subspaces. This result is just a special case of a
general result for strongly continuous unitary representations of compact
topological groups. (See, e.g., Chap. 5 of [10].) Since we already know that
$L^2(\mathbb{R}^3)$ is a direct sum of finite-dimensional, irreducible invariant subspaces,
it is probably possible to give an elementary proof of this result, but we
will not pursue that approach here.
We classified irreducible finite-dimensional representations of the Lie
algebra so(3) by their "spin" $l$, where $l$ is the largest eigenvalue for the
operator $L_3 = i\pi(F_3)$. The possible values for $l$ are the non-negative integers
(0, 1, 2, ...) and the positive half-integers (1/2, 3/2, ...). Inside $L^2(S^2)$ and
$L^2(\mathbb{R}^3)$, however, we found only irreducible representations of so(3) with
integer spin. It is easy to understand why the half-integer spin representations
do not occur: They do not correspond to any representation of the
group SO(3). Since $L^2(S^2)$ and $L^2(\mathbb{R}^3)$ both carry a natural unitary action
$\Pi$ of the group SO(3), any finite-dimensional subspace that is invariant under
the associated Lie algebra representation $\pi$ will also be invariant under
$\Pi$ and thus constitute a representation of SO(3).

Although the half-integer representations $\pi_l$ of the Lie algebra so(3) cannot
be exponentiated to representations of SO(3), they can be exponentiated
to representations of the universal cover SU(2) of SO(3), as in the proof
of Proposition 17.10. For a half-integer $l$, the associated representation $\Pi_l$ of
SU(2) satisfies $\Pi_l(-I) = -I$, which means that $\Pi_l$ does not factor through
$\mathrm{SO}(3) \cong \mathrm{SU}(2)/\{I, -I\}$. If, however, we think about projective representations,
we see that $[-I]$ is the identity element in PU(V). Thus, even when $l$
is a half-integer, we get a well-defined projective representation $\Pi_l$ of SO(3)
that satisfies

$$\Pi_l(e^X) = [e^{\pi_l(X)}]$$

for all $X \in \mathrm{so}(3)$, where $[U]$ denotes the image of $U \in \mathrm{U}(V)$ in PU(V).

It is generally believed that the physics of the universe is invariant under 
the rotation group SO(3). This does not mean that one never considers 
models without rotational symmetry, because the local environment of, 
say, a hydrogen atom in a magnetic field breaks the rotational symmetry of 
the hydrogen atom. Nevertheless, if we were to rotate both the hydrogen
atom and the magnetic field, the physics of the problem would not change. 
In quantum mechanics, rotational symmetry means that there should be 
a projective unitary representation of SO(3) on the Hilbert space of the 
universe that commutes with the Hamiltonian operator. Now, the Hilbert 
space of the universe (if there is such a thing) is built up out of Hilbert 
spaces for each type of particle. Thus, we expect that the Hilbert space 
for a single particle will also carry a projective unitary representation of 
SO(3). 

The simplest possibility for the Hilbert space of a single particle is the
Hilbert space $L^2(\mathbb{R}^3)$, which certainly carries an (ordinary) unitary action
of SO(3), as we have been discussing in this chapter. Based on various
experimental observations, however, physicists have proposed a modification
to the Hilbert space for an individual particle that incorporates "internal
degrees of freedom." The proposal is that for each type of particle,
the quantum Hilbert space should be of the form $L^2(\mathbb{R}^3)\,\hat{\otimes}\,V$, where $V$
is a finite-dimensional Hilbert space that carries an irreducible projective
unitary representation of SO(3). Here $\hat{\otimes}$ is the Hilbert tensor product
(Appendix A.4.5). The (projective) action of SO(3) on $V$ describes the action
of the rotation group on the internal degrees of freedom of the particle.

Now, according to Proposition 16.46, the space $V$ carries a (trace-zero)
ordinary representation $\pi$ of the Lie algebra so(3). In customary physics
terminology, the largest eigenvalue $l$ of the operator $L_3 := i\pi(F_3)$ in $V$ is
then called the spin of the particle. We then denote the space $V$ by $V_l$ to
indicate the value of the spin. Electrons, for example, are "spin 1/2" particles,
meaning that the Hilbert space for a single electron is $L^2(\mathbb{R}^3)\,\hat{\otimes}\,V_{1/2}$,
where $V_{1/2}$ is a two-dimensional projective representation of SO(3).

It is easy to see that the tensor product of two projective unitary
representations of a given group is again a projective unitary representation of
that group. (By contrast, the direct sum of two projective unitary
representations is in general not again a projective unitary representation.) In
the case at hand, we can think of $L^2(\mathbb{R}^3)$ as carrying a unitary representation
$\Pi$ of SU(2) that factors through SO(3), that is, for which $\Pi(-I) = I$.
Meanwhile, we can think of $V_l$ as carrying a unitary representation $\Pi_l$
of SU(2) in which $\Pi_l(-I) = \pm I$, with the plus sign if $l$ is an integer and
the minus sign if $l$ is a half-integer. Thus, $L^2(\mathbb{R}^3)\,\hat{\otimes}\,V_l$ carries a unitary
representation $\Pi \otimes \Pi_l$ of SU(2) in which $(\Pi \otimes \Pi_l)(-I) = \pm I$. Thus, in the
projective sense, $\Pi \otimes \Pi_l$ factors through SO(3).








Summary 17.20 (Spin) Each type of particle has a "spin" $l$, which is a
non-negative integer or half-integer. The Hilbert space for such a particle
is $L^2(\mathbb{R}^3)\,\hat{\otimes}\,V_l$, where $V_l$ is an irreducible projective representation of SO(3)
of dimension $2l + 1$.


Since $V_l$ is finite dimensional, the Hilbert tensor product $L^2(\mathbb{R}^3)\,\hat{\otimes}\,V_l$
coincides with the algebraic tensor product of $L^2(\mathbb{R}^3)$ with $V_l$.


Definition 17.21 A particle for which the spin is an integer is called a bo- 
son, and a particle for which the spin is a half-integer is called a fermion. 


To see the significance of the distinction between integer and half-integer 
spin, one needs to look at the structure of the Hilbert space describing 
multiple particles of a given type, such as the Hilbert space for five electrons. 
This topic is discussed in Chap. 19. 


17.9 Tensor Products of Representations: 
“Addition of Angular Momentum” 


Let $V_l$ and $V_m$ be irreducible representations of so(3) with dimensions $2l + 1$
and $2m + 1$, respectively. As discussed in Sect. 16.8, the tensor product
space $V_l \otimes V_m$ can be viewed as another representation of so(3). Unless
one of $l$ and $m$ is zero, $V_l \otimes V_m$ is not irreducible. It is of interest, then,
to decompose $V_l \otimes V_m$ as a direct sum of irreducible invariant subspaces.
This decomposition (in the case that $V_l$ is an irreducible SO(3)-invariant
subspace of $L^2(\mathbb{R}^3)$ and $V_m$ is the space of internal degrees of freedom of a
particle) will help us in decomposing the Hilbert space for a particle with
spin into irreducible, SO(3)-invariant subspaces.


Proposition 17.22 Let $V_{1/2}$ be an irreducible representation of so(3) of
dimension 2, and let $V_l$ be an irreducible representation of so(3) of dimension
$2l + 1$, where $l$ is a non-negative integer or half-integer. If $l = 0$,
$V_l \otimes V_{1/2}$ is irreducible. If $l > 0$, then we have

$$V_l \otimes V_{1/2} \cong V_{l+1/2} \oplus V_{l-1/2},$$

where "$\cong$" denotes an isomorphism of representations.


Proof. If $l = 0$, then it is easy to see that $V_l \otimes V_{1/2}$ is isomorphic to $V_{1/2}$,
which is irreducible. Assume, then, that $l > 0$.

Let $L^+$, $L^-$, and $L_3$ be the operators in Theorem 17.4, constructed using
the representation $\pi_l$, and let $\sigma^+$, $\sigma^-$, and $\sigma_3$ be the analogous operators
constructed using the representation $\pi_{1/2}$. As in Sect. 16.8, we define operators
$J^+$, $J^-$, and $J_3$ on $V_l \otimes V_{1/2}$ by

$$\begin{aligned}
J^+ &= L^+ \otimes I + I \otimes \sigma^+ \\
J^- &= L^- \otimes I + I \otimes \sigma^- \\
J_3 &= L_3 \otimes I + I \otimes \sigma_3.
\end{aligned} \tag{17.18}$$




Let $\{v_0, \ldots, v_{2l}\}$ be a basis for $V_l$ as in Theorem 17.4, and let $\{e_0, e_1\}$ be
a similar basis for $V_{1/2}$. Then the vectors of the form $v_j \otimes e_k$ form a basis
for $V_l \otimes V_{1/2}$. The eigenvalues of $J_3$ are the numbers of the form

$$(l - j) + \left(\tfrac{1}{2} - k\right),$$

$j = 0, 1, \ldots, 2l$, $k = 0, 1$. Thus, the eigenvalues of $J_3$ range from $l + 1/2$ to
$-(l + 1/2)$. The numbers $l + 1/2$ and $-(l + 1/2)$ occur as eigenvalues only
once. All other eigenvalues $\lambda$ occur twice, once as $(\lambda - 1/2) + 1/2$ and once
as $(\lambda + 1/2) - 1/2$.

The vector $v_0 \otimes e_0$ is an eigenvector for $J_3$ with the largest possible
eigenvalue $l + 1/2$, so that $J^+(v_0 \otimes e_0) = 0$. According to Proposition 17.9,
if we apply $J^-$ repeatedly, we will obtain a "chain" of eigenvectors of length
$2l + 2$, and the span of these vectors forms an irreducible invariant subspace
$W_0$ isomorphic to $V_{l+1/2}$.

Now, by Proposition 17.7, there exist inner products on $V_l$ and $V_{1/2}$
that make $\pi_l$ and $\pi_{1/2}$ "unitary," meaning that $\pi(X)^* = -\pi(X)$ for all
$X \in \mathrm{so}(3)$. If we use on $V_l \otimes V_{1/2}$ the natural inner product, obtained from
the inner products on $V_l$ and $V_{1/2}$ as in Appendix A.4.5, then $\pi_l \otimes \pi_{1/2}$ is
also unitary. Thus, the orthogonal complement $W_0^\perp$ of the invariant subspace
$W_0$ is also invariant. Since all eigenvalues for $J_3$ except the largest and
smallest have multiplicity 2, and each eigenvalue occurs exactly once in $W_0$,
we see that the largest eigenvalue for $J_3$ in $W_0^\perp$ is $l - 1/2$. Let
$w_0 \in W_0^\perp$ be an eigenvector for $J_3$ with eigenvalue
$l - 1/2$. If we repeatedly apply the lowering operator $J^- = L^- \otimes I + I \otimes \sigma^-$
to $w_0$, we will obtain a chain of eigenvectors of length $2l$. These eigenvectors
span an irreducible invariant subspace $W_1$ of $V_l \otimes V_{1/2}$ of dimension $2l$,
isomorphic to $V_{l-1/2}$. Since

$$\dim W_0 + \dim W_1 = 4l + 2 = \dim(V_l \otimes V_{1/2}),$$

we must have $W_1 = W_0^\perp$, completing the proof. ∎
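The eigenvalue count used in this proof can be checked mechanically. The following is a minimal Python sketch, not part of the original argument; the function name `j3_weights` is hypothetical, and the code only tabulates the numbers $(l - j) + (1/2 - k)$, it does not construct the representation.

```python
from collections import Counter
from fractions import Fraction

def j3_weights(l):
    """Multiset of J_3 eigenvalues (l - j) + (1/2 - k) on V_l (x) V_{1/2},
    for j = 0, ..., 2l and k = 0, 1 (l a non-negative integer or half-integer)."""
    l = Fraction(l)
    half = Fraction(1, 2)
    return Counter((l - j) + (half - k)
                   for j in range(int(2 * l) + 1)
                   for k in (0, 1))

# Example with l = 3/2: the extreme eigenvalues +2 and -2 occur once,
# every other eigenvalue occurs twice, and there are 4l + 2 = 8 in total.
w = j3_weights(Fraction(3, 2))
```

For $l = 3/2$ the eight eigenvalues split as one chain of length $2l + 2 = 4$... plus one more step, that is, a chain from $2$ down to $-2$ (five values) and a chain from $1$ down to $-1$ (three values), matching $\dim W_0 = 2l + 2 = 5$ and $\dim W_1 = 2l = 3$.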

Since an electron is a "spin 1/2" particle, the Hilbert space for a single
electron is, according to Sect. 17.8, $L^2(\mathbb{R}^3)\,\hat{\otimes}\,V_{1/2}$, where $V_{1/2}$ is an
irreducible projective unitary representation of SO(3) of dimension 2.
Meanwhile, in Sect. 17.7, we saw how to find irreducible, SO(3)-invariant
subspaces $V_{l,f}$ of $L^2(\mathbb{R}^3)$ of dimension $2l + 1$, for $l = 0, 1, 2, \ldots$, where $f$ is
an arbitrary radial function. By applying Proposition 17.22 to the case
$V_l = V_{l,f}$, we obtain irreducible SO(3)-invariant subspaces of the Hilbert
space $L^2(\mathbb{R}^3)\,\hat{\otimes}\,V_{1/2}$. Finding such subspaces is essential in, for example,
analyzing the fine structure of the hydrogen atom.

In the case that $V_l$ is an SO(3)-invariant subspace of $L^2(\mathbb{R}^3)$, the
formula for, say, the operator $J_3$ in (17.18) is written in the physics
literature as

$$J_3 = L_3 + \sigma_3, \tag{17.19}$$

where it is understood that $L_3$ acts on the first factor in the tensor product
and $\sigma_3$ acts on the second factor. (That is to say, the tensor product
with the identity operator is understood and thus not written.) Here $L_3$ is
the ordinary angular momentum operator and $\sigma_3$ describes the action of
the basis element $F_3 \in \mathrm{so}(3)$ on the space $V_{1/2}$. Formulas such as (17.19)
account for the physics terminology "addition of angular momentum" to
describe the analysis of tensor products of representations of so(3). In this
context, the operator $L_3$ ($= L_3 \otimes I$) is called an orbital angular momentum
operator, and the operator $\sigma_3$ ($= I \otimes \sigma_3$) is called a spin angular momentum
operator, and similarly for $L^\pm$ and $\sigma^\pm$.

We now record the general result on tensor products of irreducible rep- 
resentations of so(3). 














Proposition 17.23 For any $j = 0, 1/2, 1, \ldots$, let $V_j$ denote the unique
irreducible representation of so(3) of dimension $2j + 1$. Then for any $l$ and
$m$ with $l \geq m$, we have

$$V_l \otimes V_m \cong V_{l+m} \oplus V_{l+m-1} \oplus \cdots \oplus V_{l-m+1} \oplus V_{l-m}. \tag{17.20}$$


The proof of this result is similar to that of Proposition 17.22, and is 
omitted; see Theorem D.1 in Appendix D of [21]. An important property 
of this decomposition is that each irreducible representation that occurs
on the right-hand side of (17.20) occurs only once. This property of the 
representations of so(3) is the key idea in the proof of the Wigner—Eckart 
theorem. See Appendix D of [21] for details. 
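As an informal check of (17.20), the spins on the right-hand side can be recovered by counting $J_3$ eigenvalues and repeatedly peeling off the chain that starts at the largest remaining eigenvalue, in the spirit of the proof of Proposition 17.22. The sketch below is only a bookkeeping aid, not a proof: it assumes complete reducibility and that an irreducible representation is determined by its largest $J_3$ eigenvalue. The function names `chain` and `decompose` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def chain(j):
    """J_3 eigenvalues j, j-1, ..., -j of the irreducible representation V_j."""
    j = Fraction(j)
    return [j - k for k in range(int(2 * j) + 1)]

def decompose(l, m):
    """Spins of the irreducible summands of V_l (x) V_m, largest first."""
    remaining = Counter(a + b for a in chain(l) for b in chain(m))
    spins = []
    while remaining:
        top = max(remaining)          # largest eigenvalue not yet accounted for
        spins.append(top)
        for w in chain(top):          # remove one full chain top, top-1, ..., -top
            remaining[w] -= 1
            if remaining[w] == 0:
                del remaining[w]
    return spins

spins_2_1 = decompose(2, 1)                                  # V_2 (x) V_1
spins_3h_h = decompose(Fraction(3, 2), Fraction(1, 2))       # V_{3/2} (x) V_{1/2}
```

In both examples each spin appears exactly once, and the dimensions $2j + 1$ of the summands add up to $(2l + 1)(2m + 1)$.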


17.10 Vectors and Vector Operators 


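Definition 17.24 A function $\mathbf{c} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$ is said to transform like
a vector if

$$\mathbf{c}(R\mathbf{x}, R\mathbf{p}) = R(\mathbf{c}(\mathbf{x}, \mathbf{p})) \tag{17.21}$$

for all $R \in \mathrm{SO}(3)$.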


In the physics literature, the expression “is a vector” is sometimes used 
in place of “transforms like a vector.” 

Note that in Definition 17.24, we only consider the transformation property
of $\mathbf{c}$ under elements of SO(3) rather than under a general element of
O(3). If $\mathbf{c}$ transforms like a vector, one says that $\mathbf{c}$ is a "true vector" if $\mathbf{c}$
satisfies (17.21) for all $R$ in O(3) [not just in SO(3)] and one says that $\mathbf{c}$ is a
"pseudovector" if $\mathbf{c}$ satisfies $\mathbf{c}(R\mathbf{x}, R\mathbf{p}) = -R(\mathbf{c}(\mathbf{x}, \mathbf{p}))$ for $R \in \mathrm{O}(3) \setminus \mathrm{SO}(3)$.
For our purposes, it is not necessary to distinguish between true vectors
and pseudovectors.

The position function $\mathbf{c}_1(\mathbf{x}, \mathbf{p}) := \mathbf{x}$, the momentum function $\mathbf{c}_2(\mathbf{x}, \mathbf{p}) := \mathbf{p}$,
and the angular momentum function $\mathbf{c}_3(\mathbf{x}, \mathbf{p}) := \mathbf{x} \times \mathbf{p}$ are simple examples
of functions that transform like vectors. (Transformation under rotations
is one of the standard properties of the cross product.) A typical
example of a function transforming like a vector is $\mathbf{c}(\mathbf{x}, \mathbf{p}) = (\mathbf{x} \cdot \mathbf{p})\,|\mathbf{x}|\,(\mathbf{x} \times \mathbf{p})$.


Proposition 17.25 Let $\mathbf{j}(\mathbf{x}, \mathbf{p}) = \mathbf{x} \times \mathbf{p}$ denote the angular momentum
function on $\mathbb{R}^3 \times \mathbb{R}^3$. Suppose a smooth function $\mathbf{c} : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3$
transforms like a vector. Then we have

$$\{c_k, j_k\} = 0 \tag{17.22}$$

for $k = 1, 2, 3$. Furthermore, we have

$$\{c_1, j_2\} = \{j_1, c_2\} = c_3 \tag{17.23}$$

and other relations obtained from (17.23) by cyclically permuting the
indices.


Proof. Let $R(\theta)$ denote a counterclockwise rotation by angle $\theta$ in the
$(x_1, x_2)$-plane. Applying (17.21) with $R = R(\theta)$ and looking only at the
first component of the vectors, we have

$$c_1(R(\theta)\mathbf{x}, R(\theta)\mathbf{p}) = c_1(\mathbf{x}, \mathbf{p})\cos\theta - c_2(\mathbf{x}, \mathbf{p})\sin\theta. \tag{17.24}$$

Now, as in the proof of Proposition 2.30, the Poisson bracket $\{c_1, j_3\}$ is
precisely the derivative of the left-hand side of (17.24) with respect to $\theta$,
evaluated at $\theta = 0$. Thus,

$$\{c_1, j_3\} = -c_2$$

and so $\{j_3, c_1\} = c_2$, which is one of the relations obtained from (17.23) by
cyclically permuting the indices.

Meanwhile, if we again apply (17.21) with $R = R(\theta)$ but look now at the
third component of the vectors, we have that

$$c_3(R(\theta)\mathbf{x}, R(\theta)\mathbf{p}) = c_3(\mathbf{x}, \mathbf{p}).$$

Differentiating this relation with respect to $\theta$ at $\theta = 0$ gives $\{c_3, j_3\} = 0$.
All other brackets are computed similarly.
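The bracket relations can be spot-checked numerically for the sample function $\mathbf{c}(\mathbf{x}, \mathbf{p}) = (\mathbf{x} \cdot \mathbf{p})\,|\mathbf{x}|\,(\mathbf{x} \times \mathbf{p})$ mentioned above. The following Python sketch is an illustration only: it assumes the sign convention $\{f, g\} = \sum_i (\partial f/\partial x_i\,\partial g/\partial p_i - \partial f/\partial p_i\,\partial g/\partial x_i)$, approximates derivatives by central differences, and uses an arbitrarily chosen test point `z0`. All function names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def j(z):
    """Angular momentum x × p, with z = (x1, x2, x3, p1, p2, p3)."""
    return cross(z[:3], z[3:])

def c(z):
    """Sample function (x·p)|x|(x × p), which transforms like a vector."""
    x, p = z[:3], z[3:]
    scale = sum(a * b for a, b in zip(x, p)) * math.sqrt(sum(a * a for a in x))
    return tuple(scale * w for w in cross(x, p))

def component(F, k):
    return lambda z: F(z)[k]

def partial(f, z, i, h=1e-5):
    """Central-difference approximation to the i-th partial derivative of f at z."""
    up, dn = list(z), list(z)
    up[i] += h
    dn[i] -= h
    return (f(up) - f(dn)) / (2 * h)

def poisson(f, g, z):
    """Poisson bracket {f, g} at z under the sign convention assumed above."""
    return sum(partial(f, z, i) * partial(g, z, i + 3)
               - partial(f, z, i + 3) * partial(g, z, i)
               for i in range(3))

z0 = (0.3, -1.2, 0.7, 0.5, 0.4, -0.9)   # arbitrary test point (x, p)
```

With these definitions, `poisson(component(c, 0), component(j, 2), z0)` should agree with `-c(z0)[1]` up to discretization error, matching $\{c_1, j_3\} = -c_2$.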

We now turn to the quantum counterpart of a function that transforms 
like a vector. 


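Definition 17.26 For any ordered triple $\mathbf{C} := (C_1, C_2, C_3)$ of operators
on $L^2(\mathbb{R}^3)$ and any vector $\mathbf{v} \in \mathbb{R}^3$, let $\mathbf{v} \cdot \mathbf{C}$ be the operator

$$\mathbf{v} \cdot \mathbf{C} = \sum_{j=1}^{3} v_j C_j. \tag{17.25}$$

Then an ordered triple $\mathbf{C}$ of operators on $L^2(\mathbb{R}^3)$ is called a vector operator
if

$$(R\mathbf{v}) \cdot \mathbf{C} = \Pi(R)(\mathbf{v} \cdot \mathbf{C})\Pi(R)^{-1} \tag{17.26}$$

for all $R \in \mathrm{SO}(3)$.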


Here $\Pi(\cdot)$ is the natural unitary action of SO(3) on $L^2(\mathbb{R}^3)$ in
Definition 17.1. Let us try to understand what this definition is saying in the
case of, say, the angular momentum, which is (as we shall see) a vector
operator. The operators $J_1$, $J_2$, and $J_3$ represent the components of $\mathbf{J}$ in the
directions of $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$, respectively. More generally, we can consider
the component of $\mathbf{J}$ in the direction of any unit vector $\mathbf{v}$, which will be
nothing but $\mathbf{v} \cdot \mathbf{J}$, as defined in (17.25). Since there is no preferred direction
in space, we expect that for any two unit vectors $\mathbf{v}_1$ and $\mathbf{v}_2$, the operators
$\mathbf{v}_1 \cdot \mathbf{J}$ and $\mathbf{v}_2 \cdot \mathbf{J}$ should be "the same operator, up to rotation." Specifically,
if $R$ is some rotation with $R\mathbf{v}_1 = \mathbf{v}_2$, then $\mathbf{v}_1 \cdot \mathbf{J}$ and $\mathbf{v}_2 \cdot \mathbf{J}$ should differ
only by the action of $R$ on the Hilbert space $L^2(\mathbb{R}^3)$. But this is precisely
what (17.26) says, with $\mathbf{v} = \mathbf{v}_1$ and $\mathbf{C} = \mathbf{J}$:

$$\mathbf{v}_2 \cdot \mathbf{J} = \Pi(R)(\mathbf{v}_1 \cdot \mathbf{J})\Pi(R)^{-1}.$$


We will not concern ourselves with the question of whether (17.26)
continues to hold for $R \in \mathrm{O}(3) \setminus \mathrm{SO}(3)$. The position and momentum operators
$\mathbf{X}$ and $\mathbf{P}$ are easily seen to be vector operators. As in the classical case,
the cross product of two vector operators is again a vector operator. (See
Exercise 7 in Chap. 18.) In particular, the angular momentum, $\mathbf{J} = \mathbf{X} \times \mathbf{P}$,
is a vector operator.

If the operators $C_1$, $C_2$, and $C_3$ are unbounded, we should say something
in Definition 17.26 about the domains of the operators in question. The simplest
approach is to find some dense subspace $V$ of $L^2(\mathbb{R}^3)$ that is contained
in the domain of each $C_j$ and such that $V$ is invariant under rotations. In
that case, the equality in (17.26) is understood to hold when applied to a
vector in $V$. In many cases, we can take $V$ to be the Schwartz space $\mathcal{S}(\mathbb{R}^3)$.
In the following proposition, the space $V$ should satisfy certain technical
domain conditions that permit differentiation of (17.29) when applied to a
vector $\psi$ in $V$. We will not pursue the details of such conditions here.


Proposition 17.27 If $\mathbf{C}$ is a vector operator, then the components of $\mathbf{C}$
satisfy

$$\frac{1}{i\hbar}[C_j, J_j] = 0 \tag{17.27}$$

for $j = 1, 2, 3$. Furthermore, we have

$$\frac{1}{i\hbar}[C_1, J_2] = \frac{1}{i\hbar}[J_1, C_2] = C_3 \tag{17.28}$$

and other relations obtained from (17.28) by cyclically permuting the
indices.


Proof. As in the proof of Proposition 17.25, let $R(\theta)$ denote a rotation in the
$(x_1, x_2)$-plane, and let $\mathbf{e}_1 = (1, 0, 0)$. Applying (17.26) with $R = R(\theta)$ and
$\mathbf{v} = \mathbf{e}_1$, we have

$$\Pi(R(\theta))\,C_1\,\Pi(R(\theta))^{-1} = C_1\cos\theta + C_2\sin\theta. \tag{17.29}$$

But $R(\theta) = e^{\theta F_3}$, where $\{F_j\}$ is the basis for so(3) described in Sect. 16.5.
Thus, differentiating (17.29) with respect to $\theta$ at $\theta = 0$ gives

$$\pi(F_3)C_1 - C_1\pi(F_3) = C_2.$$

Since $J_3 = i\hbar\,\pi(F_3)$ (Proposition 17.3), we obtain $(1/(i\hbar))[J_3, C_1] = C_2$,
which is one of the relations obtained from (17.28) by cyclically permuting
the indices.

Meanwhile, applying (17.26) with $R = R(\theta)$ and $\mathbf{v} = \mathbf{e}_3$ gives

$$\Pi(R(\theta))\,C_3\,\Pi(R(\theta))^{-1} = C_3.$$

Differentiating this relation with respect to $\theta$ at $\theta = 0$ gives $[\pi(F_3), C_3] = 0$.
All other relations are obtained similarly. ∎

For more information about vector operators, including the Wigner— 
Eckart theorem, see Appendix D of [21]. See also Exercise 7. 
17.11 Exercises


1. Verify the expression (17.2) for the vector field $x_1\,\partial/\partial x_2 - x_2\,\partial/\partial x_1$.


2. Verify the relation (17.12) in the proof of Theorem 17.4, using induction
on $j$ and the commutation relation (17.10).


3. This exercise provides a proof of Proposition 17.8. Let $(\pi, V_l)$ denote
an irreducible representation of so(3) of dimension $2l + 1$ and let $C_\pi$
denote the Casimir operator as defined in the proposition.

(a) Show that $[\pi(F_j), C_\pi] = 0$ for all $j = 1, 2, 3$.

(b) Using Schur's lemma, show that there is some $\lambda \in \mathbb{C}$ such that
$C_\pi v = \lambda v$ for all $v \in V_l$.

(c) Show that

$$C_\pi = -\left(L_3^2 + L^- L^+ + L_3\right),$$

where $L^+$, $L^-$, and $L_3$ are as in Theorem 17.4.

(d) By computing $C_\pi$ on some suitably chosen vector in $V_l$, show
that the constant $\lambda$ in Part (b) has the value $-l(l + 1)$.


4. Let $l$ be any non-negative integer or half-integer. Construct a vector
space $V$ by decreeing that vectors $\{v_0, v_1, \ldots, v_{2l}\}$ form a basis
for $V$. Define operators $L^+$, $L^-$, and $L_3$ on $V$ by the expressions
in (17.6). Show that these operators satisfy the commutation relations
(17.8), (17.9), and (17.10).

Hint: In the case of $L^-$, treat the vector $v_{2l}$ separately from the other
basis vectors. In the case of $L^+$, treat the vector $v_0$ separately
from the other basis vectors.


5. Let $(\pi, V)$ be an irreducible representation of so(3) of dimension 2,
with basis $\{v_0, v_1\}$ as in (17.6). Consider $V \oplus V$ as a representation
of so(3) as in Sect. 16.8. Let $v = (v_0, v_1)$. Show that the smallest
invariant subspace of $V \oplus V$ containing $v$ is $V \oplus V$.

Note: This shows that $V \oplus V$ has a cyclic vector, even though $V \oplus V$
is not irreducible.


6. Compute explicit bases for the two irreducible invariant subspaces
$W_0 \cong V_{3/2}$ and $W_1 \cong V_{1/2}$ of $V_1 \otimes V_{1/2}$. Each basis element for $W_0$
or $W_1$ should be expressed as a linear combination of the elements
$v_j \otimes e_k$ in the proof of Proposition 17.22.


7. Let $V_l$, $V_m$, and $V_n$ be irreducible representations of so(3) of dimension
$2l + 1$, $2m + 1$, and $2n + 1$, respectively. Suppose that $\Phi$ and $\Psi$ are
nonzero intertwining maps of $V_l$ into $V_m \otimes V_n$. Show that $\Phi = c\Psi$ for
some $c \in \mathbb{C}$.

Hint: Use Proposition 17.23 and Schur's lemma.

Note: This result is closely related to the Wigner-Eckart theorem for
"irreducible tensor operators."
Radial Potentials and the Hydrogen
Atom 


18.1 Radial Potentials 


If $V$ is any radial function on $\mathbb{R}^3$, let $\hat{H} = -(\hbar^2/(2m))\Delta + V$ be the
corresponding Hamiltonian operator, acting on $L^2(\mathbb{R}^3)$. We will look for
solutions to the time-independent Schrödinger equation $\hat{H}\psi = E\psi$ of the
form $\psi(\mathbf{x}) = p(\mathbf{x}) f(|\mathbf{x}|)$, where $f$ is a smooth function on $(0, \infty)$ and $p$ is a
harmonic polynomial on $\mathbb{R}^3$ that is homogeneous of degree $l$.


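Proposition 18.1 Let $p$ be a harmonic polynomial on $\mathbb{R}^3$ that is homogeneous
of degree $l$ and let $f$ be a smooth function on $(0, \infty)$. Let $\psi$ be the
function on $\mathbb{R}^3 \setminus \{0\}$ given by

$$\psi(\mathbf{x}) = p(\mathbf{x}) f(|\mathbf{x}|). \tag{18.1}$$

Then on $\mathbb{R}^3 \setminus \{0\}$ we have

$$\Delta\psi(\mathbf{x}) = p(\mathbf{x})\left[\frac{d^2 f}{dr^2} + \frac{2(l+1)}{r}\frac{df}{dr}\right]_{r = |\mathbf{x}|}.$$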


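Proof. We begin with the case $l = 0$, so that $p$ is a constant (which we
take to be 1) and $\psi$ is just the radial function $f(|\mathbf{x}|)$. Then

$$\frac{\partial}{\partial x_j} f(|\mathbf{x}|) = \frac{df}{dr}\,\frac{\partial}{\partial x_j}\sqrt{x_1^2 + x_2^2 + x_3^2} = \frac{df}{dr}\,\frac{x_j}{|\mathbf{x}|}$$

and so

$$\sum_{j=1}^{3} \frac{\partial^2}{\partial x_j^2} f(|\mathbf{x}|) = \sum_{j=1}^{3}\left[\frac{d^2 f}{dr^2}\,\frac{x_j^2}{|\mathbf{x}|^2} + \frac{df}{dr}\left(\frac{1}{|\mathbf{x}|} - \frac{x_j^2}{|\mathbf{x}|^3}\right)\right] = \frac{d^2 f}{dr^2} + \frac{2}{r}\frac{df}{dr}.$$

For the general case, the product rule for the Laplacian gives

$$\Delta\psi = (\Delta p) f(|\mathbf{x}|) + 2\nabla p \cdot \nabla f(|\mathbf{x}|) + p\,\Delta f(|\mathbf{x}|).$$

Now, $\Delta p = 0$ by assumption. Furthermore, since $f(|\mathbf{x}|)$ is radial, its gradient
points in the radial direction. Thus, only the radial component of
$\nabla p$ is relevant. Moreover, on each ray through the origin, $p$ behaves like a
constant times $r^l$. Thus, the $r$-derivative of $p$ is $(l/r)p$, giving

$$\Delta\psi = \frac{2l}{r}\,p\,\frac{df}{dr} + p\left[\frac{d^2 f}{dr^2} + \frac{2}{r}\frac{df}{dr}\right],$$

which simplifies to the desired expression.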
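The formula of Proposition 18.1 can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes the sample harmonic polynomial $p(\mathbf{x}) = x_1 x_2$ (so $l = 2$) and the sample radial function $f(r) = e^{-r}$, both chosen here for convenience, and compares a finite-difference Laplacian of $\psi = p\,f$ with $p\,[f'' + (2(l+1)/r) f']$.

```python
import math

DEGREE = 2  # degree l of the sample polynomial below

def p(x):
    """Sample harmonic polynomial x1*x2, homogeneous of degree 2."""
    return x[0] * x[1]

def radius(x):
    return math.sqrt(sum(t * t for t in x))

def psi(x):
    """psi(x) = p(x) f(|x|) with the sample choice f(r) = exp(-r)."""
    return p(x) * math.exp(-radius(x))

def laplacian(fun, x, h=1e-3):
    """Second-order central-difference approximation to the Laplacian at x."""
    total = 0.0
    for i in range(3):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        total += (fun(up) - 2 * fun(x) + fun(dn)) / h ** 2
    return total

def formula(x):
    """Right-hand side p(x) [f'' + 2(l+1)/r f'] for f(r) = exp(-r)."""
    r = radius(x)
    d1, d2 = -math.exp(-r), math.exp(-r)   # f'(r) and f''(r)
    return p(x) * (d2 + 2 * (DEGREE + 1) / r * d1)

x0 = (0.8, -0.5, 1.1)   # arbitrary test point away from the origin
```

At the test point, `laplacian(psi, x0)` and `formula(x0)` should agree up to the discretization error of the finite-difference scheme.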

Although the decomposition of functions in Definition 17.18 is for many
purposes the most convenient one, it is not quite the customary way of turning
spherical harmonics into functions on $\mathbb{R}^3$. Conventionally, one works in
polar coordinates and considers functions of the form

$$\psi(r, \theta, \phi) = p(\theta, \phi)\,g(r),$$

where $p$ is the restriction to $S^2$ of an element of $V_l$. We can express this
decomposition in rectangular coordinates as

$$\psi(\mathbf{x}) = p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) g(|\mathbf{x}|) = p(\mathbf{x})\,\frac{g(|\mathbf{x}|)}{|\mathbf{x}|^l}.$$


We can then obtain a more customary form of Proposition 18.1 as follows. 


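Proposition 18.2 Suppose $p \in V_l$ and $g$ is a smooth function on $(0, \infty)$,
and let $\psi$ be the function on $\mathbb{R}^3 \setminus \{0\}$ given by

$$\psi(\mathbf{x}) = p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) g(|\mathbf{x}|).$$

Then

$$(\Delta\psi)(r\mathbf{x}) = p(\mathbf{x})\left[\frac{d^2 g}{dr^2} + \frac{2}{r}\frac{dg}{dr} - \frac{l(l+1)}{r^2}\,g(r)\right] \tag{18.2}$$

for all $\mathbf{x} \in S^2$ and $r \in (0, \infty)$.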
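Proof. Since $p$ is homogeneous of degree $l$,

$$p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) = \frac{p(\mathbf{x})}{|\mathbf{x}|^l}.$$

Thus,

$$\psi(\mathbf{x}) = p(\mathbf{x})\,\frac{g(|\mathbf{x}|)}{|\mathbf{x}|^l}.$$

Applying Proposition 18.1 gives

$$\Delta\psi(\mathbf{x}) = p(\mathbf{x})\left[\left(\frac{d^2}{dr^2} + \frac{2(l+1)}{r}\frac{d}{dr}\right)\frac{g(r)}{r^l}\right]_{r = |\mathbf{x}|}.$$

From here it is a straightforward but unilluminating calculation to verify the
formula in the proposition. ∎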
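Still another way to write functions on $\mathbb{R}^3$ is in the form

$$\psi(\mathbf{x}) = \frac{1}{|\mathbf{x}|}\,p\!\left(\frac{\mathbf{x}}{|\mathbf{x}|}\right) h(|\mathbf{x}|), \tag{18.3}$$

so that $h(r) = r\,g(r)$. If we replace $g(r)$ by $h(r)/r$ in (18.2), we obtain, after
a short calculation,

$$(\Delta\psi)(r\mathbf{x}) = \frac{1}{r}\,p(\mathbf{x})\left[\frac{d^2 h}{dr^2} - \frac{l(l+1)}{r^2}\,h(r)\right], \quad \mathbf{x} \in S^2. \tag{18.4}$$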
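Writing wave functions in the form (18.3) is convenient because we then
have, for any radial potential,

$$\left(-\frac{\hbar^2}{2m}\Delta\psi + V\psi\right)(r\mathbf{x}) = \frac{1}{r}\,p(\mathbf{x})\left[-\frac{\hbar^2}{2m}\frac{d^2 h}{dr^2} + V_{\mathrm{eff}}(r)\,h(r)\right], \tag{18.5}$$

where $V_{\mathrm{eff}}$ is the effective potential given by

$$V_{\mathrm{eff}}(r) = V(r) + \frac{\hbar^2\,l(l+1)}{2mr^2}. \tag{18.6}$$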

Note that the quantity in square brackets in (18.5) is just an ordinary
one-dimensional Schrödinger operator, since the first derivative term in (18.2)
has been eliminated. Despite the naturalness of the form (18.3), it is the 
form (18.1) that is ultimately most convenient for finding the bound states 
of the hydrogen atom Hamiltonian. 
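To make the effective potential (18.6) concrete, here is a small Python sketch. It assumes, purely for illustration, the attractive Coulomb potential $V(r) = -1/r$ in units where $\hbar = m = 1$; the function name `v_eff` and its default arguments are hypothetical.

```python
def v_eff(r, l, V=lambda r: -1.0 / r, hbar=1.0, m=1.0):
    """Effective potential V(r) + hbar^2 l(l+1) / (2 m r^2) from (18.6).

    The default V is an assumed Coulomb potential -1/r in units hbar = m = 1.
    """
    return V(r) + hbar ** 2 * l * (l + 1) / (2 * m * r ** 2)

# For l = 1 the effective potential is -1/r + 1/r^2, which has its
# minimum at r = 2, where it equals -1/4.
value_at_minimum = v_eff(2.0, 1)
```

For $l = 0$ the centrifugal term vanishes and `v_eff` reduces to the bare potential; for $l > 0$ the $1/r^2$ term dominates near the origin and produces a repulsive barrier.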

Now, as the discussion following Proposition 9.34 illustrates, even if $\psi$
is square-integrable over $\mathbb{R}^3 \setminus \{0\}$ and $\Delta\psi$ is square-integrable over $\mathbb{R}^3 \setminus \{0\}$,
$\psi$ may not be in the domain of the Laplacian, since the distributional
Laplacian of $\psi$ may contain a term that is supported at the origin. In
the case of the hydrogen atom, however, we will consider functions $\psi$ of
the form (18.1) where $f$ and $df/dr$ are bounded near the origin and have
exponential decay near infinity. Proposition 9.35 then tells us that $\psi$ is in
the domain of $\Delta$.
18.2 The Hydrogen Atom: Preliminaries


A hydrogen atom is formed out of a single electron that is “bound” to a 
proton by means of the electromagnetic attraction between the oppositely 
charged particles. The study of the hydrogen atom is a very important test 
case in quantum mechanics, and the ability of the Schrödinger equation to
explain the observed energy levels of hydrogen was a crucial early success 
of the theory. 

A proton is approximately 1,800 times as massive as an electron. Thus, 
to first approximation, we may think of the location of the proton as being 
fixed, with the electron “orbiting” around this location. A more careful 
analysis considers both the proton and the electron as orbiting around 
their center of mass. The Hamiltonian for the relative position of the two 
particles is precisely that of a particle orbiting around a fixed center, except 
that the mass of the electron is replaced by the reduced mass $\mu$ of the
electron-proton system. (See Exercise 1.) Here, as in Proposition 2.16 in
the classical case,

$$\mu = \frac{m_e m_p}{m_e + m_p},$$

where $m_p$ and $m_e$ are the masses of the proton and electron, respectively.
Since $m_p \gg m_e$, the reduced mass is nearly the same as the mass of the
electron.
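As a quick worked example of how close $\mu$ is to $m_e$, the following Python sketch evaluates the reduced-mass formula in units of the electron mass. The mass ratio 1836.15 is an assumed approximate value supplied for illustration (the text above quotes only "approximately 1,800"), and the function name `reduced_mass` is hypothetical.

```python
def reduced_mass(m_e, m_p):
    """Reduced mass m_e * m_p / (m_e + m_p) of a two-body system."""
    return m_e * m_p / (m_e + m_p)

# Assumed approximate proton-to-electron mass ratio (illustrative value).
MASS_RATIO = 1836.15

# Reduced mass in units of the electron mass.
mu_over_me = reduced_mass(1.0, MASS_RATIO)
```

With this ratio, $\mu$ differs from $m_e$ by roughly one part in 1,800, which is the size of the correction that replacing $m_e$ by $\mu$ makes to the energy levels.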

After separating out the motion of the center of mass, we are left with
the following Hamiltonian for the relative position of the electron:

$$\hat{H} = -\frac{\hbar^2}{2\mu}\Delta - \frac{Q^2}{|\mathbf{x}|}, \tag{18.7}$$

where $Q$ is the charge of the electron. (We use a system of units, such
as "electrostatic" or "Gaussian" units, in which the Coulomb constant is
equal to 1.) It follows from Theorem 9.38 that $\hat{H}$ is self-adjoint on $\mathrm{Dom}(\Delta)$
and that $\hat{H}$ is bounded below.

Note that the classical Hamiltonian $H(\mathbf{x}, \mathbf{p})$ for a hydrogen atom is not
bounded below. After all, we can simply take $\mathbf{p} = 0$ and take $\mathbf{x}$ very
close to the origin. This unboundedness would cause strange behavior for
a hypothetical classical hydrogen atom. After all, modeling a hydrogen
atom using the $1/r$ potential is only an approximation. We are using an
electrostatic formula for the force, the correct one when the positions of the
particles are held fixed, in a dynamical situation. A more realistic model
of hydrogen takes into account radiation, that is, the interaction of the
charged electron with the electromagnetic fields. Classically, a negatively
charged particle orbiting a positively charged nucleus would radiate, thus
giving up energy to the electromagnetic fields. The classical particle would
spiral rapidly toward the origin, with the particle's energy going to $-\infty$ and
the energy of the electromagnetic field going to $+\infty$. Thus, if hydrogen were
made up of classical charged particles, the electron would go into a "death
spiral" and emit a giant burst of electromagnetic radiation.

Fortunately for us, this is not how real particles behave! In actuality, the 
electron is a quantum particle. A quantum electron “orbiting” a proton can 
still give up energy to the electromagnetic field. The Hamiltonian for the 
quantum hydrogen atom, however, is bounded below, as a consequence of 
Theorem 9.38. Thus, the electron can only drop to its ground state (the 
state of lowest energy), at which point it becomes stable. 


18.3 The Bound States of the Hydrogen Atom 


Our goal in this section is to find the eigenvectors for the Hamiltonian $\hat{H}$
in (18.7) with negative eigenvalues. Such eigenvectors constitute "bound
states," that is, states in which the electron is bound to the proton. For
each negative number $E$, we look at the eigenspace $V_E$ for $\hat{H}$ with eigenvalue
$E$, that is, the space of all $\psi \in \mathrm{Dom}(\hat{H})$ satisfying $\hat{H}\psi = E\psi$. Since $\hat{H}$ is
self-adjoint and, therefore, closed, this eigenspace will be a closed subspace
of $L^2(\mathbb{R}^3)$. Since, also, $\hat{H}$ commutes with rotations, $V_E$ will be invariant
under the usual action (Definition 17.1) of SO(3) on $L^2(\mathbb{R}^3)$. Thus, by
the discussion at the end of Sect. 17.7, $V_E$ decomposes as a direct sum of
finite-dimensional, irreducible SO(3)-invariant subspaces.

We now look for such subspaces of $V_E$. In the following theorem, we
assume that the radial part of the wave function (the function $f$ in the
notation $V_{l,f}$ in Definition 17.18) has a certain very special form. After
analyzing this case, we argue that we have found in this way all of the 
eigenvectors for H with negative eigenvalues. 


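Theorem 18.3 For each positive integer $n$, let

$$E_n = -\frac{\mu Q^4}{2\hbar^2}\,\frac{1}{n^2}, \tag{18.8}$$

where $Q$ is the charge of the electron and $\mu$ is the reduced mass of the
electron-proton system, and let

$$\rho_n(\mathbf{x}) = \frac{\sqrt{8\mu|E_n|}}{\hbar}\,|\mathbf{x}|.$$

Then for each $l = 0, 1, \ldots, n - 1$, there exists a polynomial $L_{n,l}$ such that
for each homogeneous harmonic polynomial $q$ of degree $l$, the function

$$\psi(\mathbf{x}) = q(\mathbf{x})\,e^{-\rho_n(\mathbf{x})/2}\,L_{n,l}(\rho_n(\mathbf{x})) \tag{18.9}$$

satisfies

$$\hat{H}\psi = E_n\psi.$$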
It follows from Proposition 9.35 that the functions $\psi$ in (18.9) belong to
$\mathrm{Dom}(\Delta)$ and thus, by Theorem 9.38, to $\mathrm{Dom}(\hat{H})$. The polynomials $L_{n,l}$ are
the Laguerre polynomials. The coefficient of $-1/n^2$ in the formula (18.8)
for $E_n$ is the Rydberg constant (compare Sect. 1.2.1).

Let us see how to connect Theorem 18.3 to the usual expression for
the hydrogen atom eigenvectors in the physics literature. In the first place,
physicists choose a certain basis $q_{l,m}$ for the space of harmonic polynomials,
which is (up to normalization constants) the basis in Theorem 17.4. In the
second place, physicists write the solutions in spherical coordinates. When
changing to spherical coordinates, we should keep in mind that $q_{l,m}$ is
homogeneous of degree $l$ and that $\rho_n(\mathbf{x})$ is just a constant multiple of the
distance from the origin. We obtain, then, the following expression:

$$\psi_{n,l,m}(r, \theta, \phi) = Y_{l,m}(\theta, \phi)\,\rho_n^l\,e^{-\rho_n/2}\,L_{n,l}(\rho_n), \tag{18.10}$$

where $Y_{l,m}(\theta, \phi)$ is the restriction to the unit sphere of $q_{l,m}$.

Proof. If $E$ is a negative real number, we look for solutions to $\hat{H}\psi = E\psi$
of the form $q(\mathbf{x}) f(|\mathbf{x}|)$, where $q \in V_l$. Provided that $f(r)$ and $f'(r)$ are
bounded near the origin, Proposition 9.35 allows us to compute $\Delta\psi$ on
$\mathbb{R}^3 \setminus \{0\}$ without worrying about whether $\psi$ is differentiable at the origin.
Using Proposition 18.1, the equation for $f$ is

$$-\frac{\hbar^2}{2\mu}\left[\frac{d^2 f}{dr^2} + \frac{2(l+1)}{r}\frac{df}{dr}\right] - \frac{Q^2}{r}\,f = E f. \tag{18.11}$$

For large $r$, the two terms that involve a factor of $1/r$ become negligible,
and so

$$-\frac{\hbar^2}{2\mu}\frac{d^2 f}{dr^2} \approx E f. \tag{18.12}$$

Recalling that $E$ is negative, (18.12) tells us that near infinity, $f$ should
behave like a combination of a growing and a decaying exponential. Since
we want square-integrable solutions, we require that only the exponentially
decaying term be present.

We therefore postulate a solution of the form

$$f(r) = \exp\left\{-\frac{\sqrt{2\mu|E|}}{\hbar}\,r\right\} g(r), \tag{18.13}$$

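for some function $g$. If we plug (18.13) into (18.11) for $f$, there are canceling terms equal to $Eg(r)$ on each side, leaving

$$-\frac{\hbar^2}{2\mu}\left[\frac{d^2g}{dr^2} - 2\frac{\sqrt{2\mu|E|}}{\hbar}\frac{dg}{dr} + \frac{2(l+1)}{r}\frac{dg}{dr} - \frac{2(l+1)}{r}\frac{\sqrt{2\mu|E|}}{\hbar}\,g(r)\right] - \frac{Q^2}{r}\,g(r) = 0.$$

We now introduce the new variable $\rho = (\sqrt{8\mu|E|}/\hbar)\,r$. After making this change of variable, we find that each term in square brackets obtains a factor of $8\mu|E|/\hbar^2$, so that our equation becomes

$$-\frac{\hbar^2}{2\mu}\,\frac{8\mu|E|}{\hbar^2}\left[\frac{d^2g}{d\rho^2} - \frac{dg}{d\rho} + \frac{2(l+1)}{\rho}\frac{dg}{d\rho} - \frac{(l+1)}{\rho}\,g(\rho)\right] - \frac{2\sqrt{2\mu|E|}}{\hbar}\,\frac{Q^2}{\rho}\,g(\rho) = 0.$$

Multiplying through by $\rho$ and simplifying yields the equation

$$\rho\frac{d^2g}{d\rho^2} - \rho\frac{dg}{d\rho} + 2(l+1)\frac{dg}{d\rho} + \left\{\frac{Q^2\sqrt{\mu}}{\hbar\sqrt{2|E|}} - (l+1)\right\} g(\rho) = 0. \tag{18.14}$$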


If we postulate for $g$ a power series $\sum_{k=0}^{\infty} a_k\rho^k$, we obtain the following recurrence relations for the coefficients:

$$a_{k+1} = \frac{k+l+1-\lambda}{(k+1)(k+2l+2)}\,a_k, \tag{18.15}$$

where

$$\lambda = \frac{Q^2\sqrt{\mu}}{\hbar\sqrt{2|E|}}.$$

The series for $g$ will terminate, yielding a polynomial solution to (18.14), provided that $\lambda$ is an integer $n$ with $n \geq l+1$. We can then solve for the energy in terms of $n$ as follows:

$$|E| = \frac{\mu Q^4}{2\hbar^2 n^2}.$$

Recalling that $E$ is negative, we have obtained the desired form for the energy levels. Furthermore, the condition $n \geq l+1$ is the same as $l \leq n-1$. Finally, if we plug in the formula for $\rho$ in terms of $r$ and the formula for $f$ in terms of $g$, we obtain the form of the solution stated in the theorem. ∎
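The termination argument can be checked mechanically. The following is a minimal sketch, assuming the recurrence has the form $a_{k+1} = \frac{k+l+1-\lambda}{(k+1)(k+2l+2)}a_k$ with $\lambda = n$ and the normalization $a_0 = 1$ (the normalization and the function names `radial_coefficients` and `residual` are illustrative choices, not taken from the text). It builds the coefficients in exact rational arithmetic and confirms that the resulting polynomial has degree $n-l-1$ and satisfies $\rho g'' - \rho g' + 2(l+1)g' + (n-l-1)g = 0$.

```python
from fractions import Fraction


def radial_coefficients(n, l):
    """Coefficients a_0, a_1, ... of g(rho) from the recurrence with lambda = n, a_0 = 1."""
    if not 0 <= l <= n - 1:
        raise ValueError("need 0 <= l <= n - 1")
    coeffs = [Fraction(1)]
    k = 0
    while True:
        nxt = coeffs[-1] * Fraction(k + l + 1 - n, (k + 1) * (k + 2 * l + 2))
        if nxt == 0:  # the numerator vanishes at k = n - l - 1, so the series stops
            break
        coeffs.append(nxt)
        k += 1
    return coeffs


def residual(coeffs, n, l):
    """Coefficient of rho^k in rho g'' - rho g' + 2(l+1) g' + (n - l - 1) g."""
    a = coeffs + [Fraction(0)]
    return [(k + 1) * (k + 2 * l + 2) * a[k + 1] + (n - l - 1 - k) * a[k]
            for k in range(len(coeffs))]


for n in range(1, 6):
    for l in range(n):
        c = radial_coefficients(n, l)
        assert len(c) == n - l                      # polynomial of degree n - l - 1
        assert all(r == 0 for r in residual(c, n, l))
```

For example, `radial_coefficients(3, 0)` gives the coefficients of $1 - \rho + \rho^2/6$.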

It is important to emphasize that the functions in Theorem 18.3 do not span the entire Hilbert space $L^2(\mathbb{R}^3)$. After all, these functions are all eigenvectors for $\hat{H}$ with negative eigenvalues. If these vectors spanned $L^2(\mathbb{R}^3)$, then the expectation value of the energy would always be negative. But it is easy to produce functions $\psi$ in the domain of $\hat{H}$ for which $\langle\psi, \hat{H}\psi\rangle > 0$. Simply take $\psi$ to be a Gaussian wave packet with mean position far from the origin and with very large mean momentum. Then $\langle\psi, V\psi\rangle$ will be close to zero but $\langle\psi, \hat{P}^2\psi\rangle$ will be large and positive. Nevertheless, it can be shown that the functions in Theorem 18.3 span the negative-energy subspace of $L^2(\mathbb{R}^3)$. It is possible to analyze also the positive part of the spectrum of $\hat{H}$, but the spectrum above zero is purely continuous and represents a hydrogen atom that has ionized, that is, in which the electron has escaped from the proton.
Theorem 18.4 As $n$ varies over all positive integers, $l$ varies from 0 to $n-1$, and $q$ varies over all homogeneous harmonic polynomials of degree $l$, the eigenvectors in Theorem 18.3 span the negative-energy subspace of $L^2(\mathbb{R}^3)$, that is, the range of the projection $\mu^{\hat{H}}((-\infty,0))$, where $\mu^{\hat{H}}$ is the projection-valued measure associated to $\hat{H}$ by the spectral theorem.


Proof. The proof requires results from spectral theory that go beyond the 
machinery that we have developed in Chaps. 9 and 10, and which we cannot 
reproduce in full here. Specifically, we make use of Theorem V.5.7 of [27], 
which tells us that the negative-energy portion of the spectrum of H is 
discrete, consisting of eigenvalues of finite multiplicity accumulating only 
at zero. 

We indicate briefly why the above result holds. If $A$ and $B$ are unbounded self-adjoint operators, let us say that $B$ is a relatively compact perturbation of $A$ if $B(A - \lambda I)^{-1}$ is a compact operator for every $\lambda$ in the resolvent set of $A$. According to Lemma V.5.8 of [27], the potential energy operator for the hydrogen atom is a relatively compact perturbation of the kinetic energy operator. This is a strengthening of what we showed in the proof of Theorem 9.38, namely that the potential energy operator is relatively bounded with respect to the kinetic energy operator, with relative bound less than 1. The proof of relative compactness relies on the fact that the potential for the hydrogen atom goes to zero at infinity.

Meanwhile, let us say that $\lambda$ belongs to the essential spectrum of an unbounded self-adjoint operator $A$ if either $\lambda$ is a nonisolated point in $\sigma(A)$ or $\lambda$ is an eigenvalue for $A$ with infinite multiplicity. According to Theorem IV.5.35 of [27], a relatively compact perturbation of a self-adjoint operator does not change the essential spectrum. Thus, the essential spectrum of $\hat{H}$ is equal to the essential spectrum of the kinetic energy operator, which is certainly contained in $[0, \infty)$, since the kinetic energy operator is non-negative. It follows that any point in the negative-energy part of the spectrum of $\hat{H}$ must be an isolated point in $\sigma(\hat{H})$ and an eigenvalue of finite multiplicity.

In light of the preceding result, there is no continuous spectrum for $\hat{H}$ below zero, and we need only look for square-integrable eigenvectors. Since, also, each eigenspace for $\hat{H}$ with eigenvalue $E < 0$ is finite dimensional, it will decompose as a direct sum of irreducible, $\mathrm{SO}(3)$-invariant subspaces. Such subspaces, according to Proposition 17.19, are always of the form $V_{l,f}$ for some $l$ and $f$, where $V_{l,f}$ is as in Definition 17.18. Thus, we look for functions $\psi$ of the form $\psi(\mathbf{x}) = p(\mathbf{x}) f(|\mathbf{x}|)$ such that $\hat{H}\psi = E\psi$ for some $E < 0$.

Now, if a function of the form $p(\mathbf{x}) f(|\mathbf{x}|)$ is to be an eigenfunction of the Hamiltonian, $f$ must satisfy the differential equation (18.11). By elementary results from the theory of linear ordinary differential equations, this equation has precisely two linearly independent solutions, for any value of $E$. Both solutions can be constructed by postulating a solution of the form (18.13), introducing the new variable $\rho$, and then using a power series expansion for $g(\rho)$ (Exercise 9). One of the solutions for $g(\rho)$ will have a power series starting with $\rho^{-(2l+1)}$, in which case $\psi(\mathbf{x})$ will blow up like $1/|\mathbf{x}|^{l+1}$ near the origin; such a function is not in the domain of the Hamiltonian (Exercise 14 in Chap. 9). The other solution for $g(\rho)$ will start with $\rho^0$ and may be obtained by using the form (18.13), changing from the variable $r$ to the variable $\rho$, and then using the recurrence relation (18.15) to define the coefficients of a power series. If the resulting series does not terminate, it is not hard to see that the terms will behave for large $k$ like the series for $e^{\rho}$. Since the function $f$ is equal to $e^{-\rho/2}g(\rho)$, this function will grow like $e^{\rho/2}$ near infinity, which means that $\psi$ will not be in $L^2(\mathbb{R}^3)$. Thus, to get a square-integrable solution, the series for $g(\rho)$ must terminate, in which case $\psi$ is one of the functions in Theorem 18.3. ∎


Corollary 18.5 Each eigenvalue $E_n$, as given in Theorem 18.3, has multiplicity $n^2$.


Proof. According to Theorem 18.4, the eigenvectors in Theorem 18.3 constitute all of the eigenvectors for $\hat{H}$ with eigenvalue $E_n$. The number of independent eigenvectors with eigenvalue $E_n$ is thus the sum of the dimensions of the spaces $V_l$ of spherical harmonics, with $l = 0, 1, \ldots, n-1$. This number is, by Theorem 17.12,

$$\sum_{l=0}^{n-1} (2l+1) = n^2,$$

as claimed. ∎


18.4 The Runge-Lenz Vector in the Quantum Kepler Problem


In Sect. 2.6, we showed that the classical Kepler problem can be solved almost completely by making use of the Runge-Lenz vector, which is a conserved quantity. The quantum version of the Runge-Lenz vector commutes with the Hamiltonian and can elucidate a number of special properties of the quantum Kepler problem, which we typically think of as describing a hydrogen atom. In particular, the Runge-Lenz vector will help to explain (1) the simple form $-R/n^2$ of the negative energies of the hydrogen atom and (2) the apparent coincidence by which the energy of the states in (18.9) is independent of $l$ for a given $n$. Note that the rotational symmetry of the problem explains why the energy of the states in (18.9) is independent of the choice of the harmonic polynomial $q$. Nevertheless, rotational symmetry cannot explain why states for different values of $l$ (and thus different radial dependence in the wave function) have the same energy. This apparent coincidence will be explained by an additional symmetry of the problem that is expressible in terms of the Runge-Lenz vector. See also Sect. 7 of [17] for a somewhat different (but related) explanation for the structure of the eigenvalues of the hydrogen atom and their multiplicities.

There are several computations involving the Runge-Lenz vector that, while elementary, are laborious. Those computations are deferred to Sect. 18.6.


18.4.1 Some Notation 


To keep the notation as simple as possible, we will adopt in this section Einstein's summation convention, which states that repeated indices are always summed on, even if there is no summation sign written. In this section, the sum will always range from 1 to 3. Using this convention, we write, say, the dot product of two vectors $\mathbf{u}, \mathbf{v}$ in $\mathbb{R}^3$ as $\mathbf{u}\cdot\mathbf{v} = u_j v_j$, where the summation convention frees us from having to write out explicitly the sum over $j$.

We will make frequent use of the totally antisymmetric symbol $\varepsilon_{jkl}$, where $j$, $k$, and $l$ range from 1 to 3, defined as follows.


Definition 18.6 For $j, k, l \in \{1, 2, 3\}$, define $\varepsilon_{jkl}$ by the formula

$$\varepsilon_{jkl} = \begin{cases} 1 & \text{if } (j,k,l) \text{ is an even permutation of } (1,2,3) \\ -1 & \text{if } (j,k,l) \text{ is an odd permutation of } (1,2,3) \\ 0 & \text{if any two of } j,k,l \text{ are equal.} \end{cases}$$


Thus, for example, $\varepsilon_{321} = -1$ and $\varepsilon_{212} = 0$. The commutation relations for the basis $\{F_1, F_2, F_3\}$ for $\mathrm{so}(3)$ may be written (using the summation convention!) as

$$[F_j, F_k] = \varepsilon_{jkl}F_l. \tag{18.16}$$

For instance, if we take $j = 1$ and $k = 2$ in (18.16), then the sum on $l$ gives a nonzero value only when $l = 3$, and we recover the relation $[F_1, F_2] = F_3$.
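As a concrete check of the convention, the sketch below implements the symbol and verifies the relation $[F_j, F_k] = \varepsilon_{jkl}F_l$ for all nine index pairs. The matrices used here, with entries $(F_j)_{ab} = -\varepsilon_{jab}$, are an assumption: they are one standard choice of basis for $\mathrm{so}(3)$, and the basis used elsewhere in the text may differ by conventions. The closed-form product in `epsilon` is likewise just a convenient implementation of the definition.

```python
def epsilon(j, k, l):
    """Totally antisymmetric symbol for indices in {1, 2, 3}."""
    # The product is +2 on even permutations, -2 on odd ones, 0 on repeats.
    return (j - k) * (k - l) * (l - j) // 2


IDX = (1, 2, 3)

# Assumed basis of so(3): (F_j)_{ab} = -epsilon(j, a, b).
F = {j: [[-epsilon(j, a, b) for b in IDX] for a in IDX] for j in IDX}


def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[a][b] - BA[a][b] for b in range(3)] for a in range(3)]


def check_so3_relations():
    """True if [F_j, F_k] equals the sum over l of epsilon(j, k, l) F_l for all j, k."""
    for j in IDX:
        for k in IDX:
            rhs = [[sum(epsilon(j, k, l) * F[l][a][b] for l in IDX)
                    for b in range(3)] for a in range(3)]
            if commutator(F[j], F[k]) != rhs:
                return False
    return True


assert check_so3_relations()
```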


18.4.2 The Classical Runge-Lenz Vector, Revisited 


We have already introduced, in Sect. 2.6, the Runge-Lenz vector $\mathbf{A}$ in the classical mechanics of a particle moving in a $1/r$ potential. We require a few more properties of $\mathbf{A}$ before turning to the quantum version. We consider a classical particle in $\mathbb{R}^3$ with Hamiltonian given by

$$H(\mathbf{x},\mathbf{p}) = \frac{|\mathbf{p}|^2}{2\mu} - \frac{Q^2}{|\mathbf{x}|}. \tag{18.17}$$

This is just the Hamiltonian for the classical Kepler problem, except that we replace the mass $m$ of the planet by the reduced mass $\mu$ of the electron-proton system, and we replace the constant $k := mMG$ by $Q^2$.

For the Hamiltonian in (18.17), the Runge-Lenz vector is given by the formula

$$\mathbf{A}(\mathbf{x},\mathbf{p}) = \frac{1}{\mu Q^2}\,\mathbf{p}\times\mathbf{J} - \frac{\mathbf{x}}{|\mathbf{x}|},$$

where $\mathbf{J} := \mathbf{x}\times\mathbf{p}$ is the angular momentum. By Proposition 2.34, the Runge-Lenz vector is a conserved quantity for the classical Kepler problem, in addition to $H$ and $\mathbf{J}$, which are conserved quantities for any radial potential. By results of Sect. 2.6, we have the following relations among these conserved quantities:

$$\mathbf{A}\cdot\mathbf{J} = 0$$

$$|\mathbf{A}|^2 = 1 + \frac{2H}{\mu Q^4}\,|\mathbf{J}|^2.$$
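The two relations can be spot-checked numerically at a phase-space point. The sketch below assumes the formulas $H = |\mathbf{p}|^2/(2\mu) - Q^2/|\mathbf{x}|$, $\mathbf{J} = \mathbf{x}\times\mathbf{p}$, and $\mathbf{A} = \frac{1}{\mu Q^2}\mathbf{p}\times\mathbf{J} - \mathbf{x}/|\mathbf{x}|$; the sample point and the values of $\mu$ and $Q$ are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kepler_quantities(x, p, mu, Q):
    """Return (H, J, A) for the classical Kepler problem at the point (x, p)."""
    r = math.sqrt(dot(x, x))
    J = cross(x, p)
    H = dot(p, p) / (2 * mu) - Q ** 2 / r
    pxJ = cross(p, J)
    A = tuple(pxJ[i] / (mu * Q ** 2) - x[i] / r for i in range(3))
    return H, J, A


mu, Q = 1.3, 0.9                      # arbitrary positive constants
H, J, A = kepler_quantities((1.0, 0.5, -0.3), (0.2, -0.4, 0.7), mu, Q)

# A . J = 0
assert math.isclose(dot(A, J), 0.0, abs_tol=1e-12)
# |A|^2 = 1 + 2 H |J|^2 / (mu Q^4)
assert math.isclose(dot(A, A), 1 + 2 * H * dot(J, J) / (mu * Q ** 4), rel_tol=1e-9)
```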


Lemma 18.7 The Runge-Lenz vector $\mathbf{A}$ and the Hamiltonian $H$ in (18.17) satisfy the following Poisson bracket relations:

$$\{A_j, H\} = 0$$

$$\{A_j, A_m\} = -\frac{2}{\mu Q^4}\,\varepsilon_{jml}\,J_l H. \tag{18.18}$$


We have already shown that the Runge-Lenz vector is a conserved quantity (Proposition 2.34), which is equivalent (Proposition 2.25) to saying that the Poisson bracket of $A_j$ with $H$ is zero, as claimed. The proof of (18.18) is deferred to Sect. 18.6. We now introduce certain combinations of the Runge-Lenz vector, the angular momentum, and the Hamiltonian that form a Lie algebra under the Poisson bracket. In the construction of these functions, we need to take a square root of the Hamiltonian, which necessitates separating the positive-energy and negative-energy parts of the phase space. Our interest is primarily in the negative-energy case.


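Definition 18.8 Let $U^-$ denote the negative-energy part of the classical phase space,

$$U^- = \{ (\mathbf{x},\mathbf{p}) \in \mathbb{R}^6 \mid H(\mathbf{x},\mathbf{p}) < 0 \}.$$

Consider on $U^-$ the normalized Runge-Lenz vector $\mathbf{B}$ given by

$$\mathbf{B} = Q^2\sqrt{\frac{\mu}{2|H|}}\;\mathbf{A}.$$

Define also vector-valued functions $\mathbf{I}$ and $\mathbf{K}$ on $U^-$ by

$$\mathbf{I} = \frac{\mathbf{J}+\mathbf{B}}{2}; \qquad \mathbf{K} = \frac{\mathbf{J}-\mathbf{B}}{2}.$$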
Theorem 18.9 The functions $\mathbf{I}$ and $\mathbf{K}$ Poisson-commute with the Hamiltonian and satisfy the following Poisson-bracket relations on the negative-energy set $U^-$:

$$\{I_j, I_k\} = \varepsilon_{jkl}I_l$$
$$\{K_j, K_k\} = \varepsilon_{jkl}K_l$$
$$\{I_j, K_k\} = 0.$$

The functions $\mathbf{I}$ and $\mathbf{K}$ also satisfy the following algebraic relations:

$$|\mathbf{I}|^2 = |\mathbf{K}|^2 = \frac{\mu Q^4}{8|H|}.$$


In Theorem 18.9, we use the summation convention introduced in the previous subsection. The proof of this theorem is elementary but rather laborious, and is deferred to Sect. 18.6.
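The algebraic relation for $|\mathbf{I}|^2$ and $|\mathbf{K}|^2$ can be spot-checked numerically. The following sketch assumes the normalization $\mathbf{B} = Q^2\sqrt{\mu/(2|H|)}\,\mathbf{A}$ together with $\mathbf{I} = (\mathbf{J}+\mathbf{B})/2$ and $\mathbf{K} = (\mathbf{J}-\mathbf{B})/2$; the sample point is an arbitrary choice with $H < 0$, and the function name `i_and_k` is hypothetical. It does not test the Poisson-bracket relations, only the norms.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def i_and_k(x, p, mu, Q):
    """Return (H, I, K) at a negative-energy point (x, p) of the Kepler problem."""
    r = math.sqrt(dot(x, x))
    J = cross(x, p)
    H = dot(p, p) / (2 * mu) - Q ** 2 / r
    if H >= 0:
        raise ValueError("the point must lie in the negative-energy set")
    pxJ = cross(p, J)
    A = [pxJ[i] / (mu * Q ** 2) - x[i] / r for i in range(3)]
    scale = Q ** 2 * math.sqrt(mu / (2 * abs(H)))      # assumed normalization of B
    B = [scale * a for a in A]
    I = [(J[i] + B[i]) / 2 for i in range(3)]
    K = [(J[i] - B[i]) / 2 for i in range(3)]
    return H, I, K


mu, Q = 1.3, 0.9
H, I, K = i_and_k((1.0, 0.5, -0.3), (0.2, -0.4, 0.7), mu, Q)
expected = mu * Q ** 4 / (8 * abs(H))
assert math.isclose(dot(I, I), expected, rel_tol=1e-9)
assert math.isclose(dot(K, K), expected, rel_tol=1e-9)
```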

The span of the functions $I_1, I_2, I_3$ and $K_1, K_2, K_3$ on $U^-$, which is the same as the span of the functions $B_1, B_2, B_3$ and $J_1, J_2, J_3$, forms a 6-dimensional Lie algebra under the Poisson bracket. Comparing the Poisson-bracket relations among the $I$'s and among the $K$'s to the relations among the basis elements $F_1, F_2, F_3$ for $\mathrm{so}(3)$, we see that the span of the $I$'s and the span of the $K$'s are both isomorphic to $\mathrm{so}(3)$ [or, if you prefer, to $\mathrm{su}(2)$]. Since also each $I_j$ commutes with each $K_k$, the 6-dimensional Lie algebra spanned by the $I$'s and the $K$'s is isomorphic to $\mathrm{so}(3) \oplus \mathrm{so}(3)$. Meanwhile, as demonstrated in Exercise 4, $\mathrm{so}(3) \oplus \mathrm{so}(3)$ is isomorphic to the Lie algebra $\mathrm{so}(4)$. Since all the $I$'s and $K$'s Poisson-commute with the Hamiltonian, we say that the Kepler problem has $\mathrm{so}(4)$ symmetry. This is in contrast to the dynamics of a particle moving in $\mathbb{R}^3$ in the force generated by a typical radial potential, which has only $\mathrm{so}(3)$ symmetry.

To be more precise, "$\mathrm{so}(4)$ symmetry" prevails only on the negative-energy subset $U^-$ of the classical phase space. On the positive-energy subset $U^+$, the span of the functions $B_1, B_2, B_3$ and $J_1, J_2, J_3$ again forms a 6-dimensional Lie algebra. This Lie algebra, however, is not isomorphic to $\mathrm{so}(4)$, but rather to $\mathrm{so}(3,1)$, where $\mathrm{so}(3,1)$ is the Lie algebra of the group of $4 \times 4$ matrices that preserve the quadratic form $x_1^2 + x_2^2 + x_3^2 - x_4^2$. The reason the formulas on $U^+$ are different from those on $U^-$ is that calculations of the relevant Poisson brackets involve the function $H/|H|$, which has the value 1 on $U^+$ and the value $-1$ on $U^-$. (The factor of $H$ comes from Lemma 18.7 and the factor of $|H|$ from the factor of $\sqrt{|H|}$ in the definition of $\mathbf{B}$.)


18.4.3 The Quantum Runge-Lenz Vector


We now introduce the quantum counterpart $\hat{\mathbf{A}}$ of the classical Runge-Lenz vector $\mathbf{A}$. The quantum Runge-Lenz vector satisfies most of the same properties as the classical version, with a few small but crucial "quantum corrections."

Definition 18.10 Define the quantum Runge-Lenz vector by

$$\hat{\mathbf{A}} = \frac{1}{2\mu Q^2}\left(\hat{\mathbf{P}}\times\hat{\mathbf{J}} - \hat{\mathbf{J}}\times\hat{\mathbf{P}}\right) - \frac{\hat{\mathbf{X}}}{|\hat{\mathbf{X}}|}.$$


Note that in the quantum case, $-\hat{\mathbf{J}}\times\hat{\mathbf{P}}$ is not the same as $\hat{\mathbf{P}}\times\hat{\mathbf{J}}$, because of the noncommutativity of the factors. The particular combination of $\hat{\mathbf{P}}\times\hat{\mathbf{J}}$ and $\hat{\mathbf{J}}\times\hat{\mathbf{P}}$ in Definition 18.10 is used because it yields a self-adjoint operator. The Runge-Lenz vector can also be computed as

$$\hat{\mathbf{A}} = \frac{1}{\mu Q^2}\left(\hat{\mathbf{P}}\times\hat{\mathbf{J}} - i\hbar\hat{\mathbf{P}}\right) - \frac{\hat{\mathbf{X}}}{|\hat{\mathbf{X}}|}, \tag{18.19}$$

as will be verified in Sect. 18.6.

In the interests of keeping the exposition manageable, we will not concern ourselves in what follows with determining the precise domains on which various identities hold.


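Proposition 18.11 The quantum Runge-Lenz vector $\hat{\mathbf{A}}$ satisfies the following relations:

$$\hat{\mathbf{A}}\cdot\hat{\mathbf{J}} = 0$$

$$\hat{\mathbf{A}}\cdot\hat{\mathbf{A}} = 1 + \frac{2}{\mu Q^4}\,\hat{H}\left(\hat{\mathbf{J}}\cdot\hat{\mathbf{J}} + \hbar^2\right). \tag{18.20}$$


Note that there is a "quantum correction" in (18.20); the factor of $\mathbf{J}\cdot\mathbf{J}$ in the classical expression for $\mathbf{A}\cdot\mathbf{A}$ is replaced by $\hat{\mathbf{J}}\cdot\hat{\mathbf{J}} + \hbar^2$. This correction gives rise to a quantum correction in (18.22), which in turn is essential to getting the correct value for the energy eigenvalues in Corollary 18.17. The proofs of this result and the other results of this section are deferred to Sect. 18.6.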


Lemma 18.12 The quantum Runge-Lenz vector $\hat{\mathbf{A}}$ and the Hamiltonian $\hat{H}$ satisfy the following commutation relations:

$$\frac{1}{i\hbar}[\hat{A}_j, \hat{H}] = 0$$

$$\frac{1}{i\hbar}[\hat{A}_j, \hat{A}_m] = -\frac{2}{\mu Q^4}\,\varepsilon_{jml}\,\hat{J}_l\hat{H}. \tag{18.21}$$


Note that since $\hat{H}$ commutes with rotations, it commutes with the angular momentum operators $\hat{J}_l$. Thus, in (18.21), we could just as well write $\hat{H}\hat{J}_l$ in place of $\hat{J}_l\hat{H}$. As in the classical case, if we normalize the components of the Runge-Lenz vector by dividing by the square root of the Hamiltonian, then these operators together with the angular momentum operators form a 6-dimensional Lie algebra.
Definition 18.13 Let $V^-$ denote the negative-energy subspace of $L^2(\mathbb{R}^3)$, that is, the range of the spectral projection $\mu^{\hat{H}}((-\infty,0))$. Let $|\hat{H}|$ denote the restriction to $V^-$ of the operator $-\hat{H}$. On $V^-$, define operators $\hat{\mathbf{B}}$ by

$$\hat{\mathbf{B}} = Q^2\sqrt{\frac{\mu}{2}}\;\frac{1}{\sqrt{|\hat{H}|}}\,\hat{\mathbf{A}}.$$

Define also operators $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$, as in the classical case, by

$$\hat{\mathbf{I}} = \frac{\hat{\mathbf{J}}+\hat{\mathbf{B}}}{2}; \qquad \hat{\mathbf{K}} = \frac{\hat{\mathbf{J}}-\hat{\mathbf{B}}}{2}.$$

It is possible to define the absolute value of any self-adjoint operator by means of the functional calculus. However, since the restriction of $\hat{H}$ to $V^-$ is, by definition, negative definite, the restriction of $|\hat{H}|$ to $V^-$ coincides with the restriction to $V^-$ of $-\hat{H}$. The operator $1/\sqrt{|\hat{H}|}$ is the operator whose restriction to the energy eigenspace with eigenvalue $E_n$ is $(1/\sqrt{|E_n|})\,I$. The components of $\hat{\mathbf{B}}$ are unbounded operators, defined on suitable dense subspaces of the Hilbert space $V^-$.


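Theorem 18.14 The operators $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$ commute with the Hamiltonian $\hat{H}$ and satisfy the following commutation relations:

$$\frac{1}{i\hbar}[\hat{I}_j, \hat{I}_k] = \varepsilon_{jkl}\hat{I}_l$$
$$\frac{1}{i\hbar}[\hat{K}_j, \hat{K}_k] = \varepsilon_{jkl}\hat{K}_l$$
$$\frac{1}{i\hbar}[\hat{I}_j, \hat{K}_k] = 0.$$

These operators also satisfy the following algebraic relations:

$$\hat{\mathbf{I}}\cdot\hat{\mathbf{I}} = \hat{\mathbf{K}}\cdot\hat{\mathbf{K}} = \frac{\mu Q^4}{8|\hat{H}|} - \frac{\hbar^2}{4}. \tag{18.22}$$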


18.4.4 Representations of so(4) 


In light of the commutation relations in Theorem 18.14, we can define a representation $\pi$ of the Lie algebra $\mathrm{so}(4) \cong \mathrm{so}(3) \oplus \mathrm{so}(3)$ on the negative-energy subspace $V^-$ as follows:

$$\pi(F_j, 0) = \frac{1}{i\hbar}\hat{I}_j; \qquad \pi(0, F_j) = \frac{1}{i\hbar}\hat{K}_j. \tag{18.23}$$

It is therefore desirable to classify the irreducible finite-dimensional representations of $\mathrm{so}(3) \oplus \mathrm{so}(3)$, which we do in the following proposition.
Proposition 18.15 Suppose $V_k$ and $V_l$ are irreducible representations of $\mathrm{so}(3)$ of dimensions $2k+1$ and $2l+1$, respectively. Then $V_k \otimes V_l$ is irreducible when viewed as a representation of $\mathrm{so}(3) \oplus \mathrm{so}(3)$ as in Remark 16.49. Furthermore, every irreducible finite-dimensional representation of $\mathrm{so}(3) \oplus \mathrm{so}(3)$ is isomorphic to $V_k \otimes V_l$ for a unique ordered pair $(k,l)$.

For any representation $V_k \otimes V_l$ of $\mathrm{so}(3) \oplus \mathrm{so}(3)$, define Casimir operators $C_1$ and $C_2$ by the formula

$$C_1 = \sum_{j=1}^{3} \pi(F_j, 0)^2; \qquad C_2 = \sum_{j=1}^{3} \pi(0, F_j)^2.$$

Then we have

$$C_1 = -k(k+1)I; \qquad C_2 = -l(l+1)I.$$


Proof. To classify the irreducible representations of $\mathrm{so}(3) \oplus \mathrm{so}(3)$, we could appeal to the general theory of representations of direct sums of Lie algebras. It is not hard, however, to give a direct proof using the same sort of reasoning we used in the classification of irreducible representations of $\mathrm{so}(3)$. We will omit the details of this computation. The result on the Casimir operators follows easily from Proposition 17.8. ∎

In any finite-dimensional subspace of $V^-$ that is invariant and irreducible under the action of $\mathrm{so}(3) \oplus \mathrm{so}(3)$ in (18.23), the Casimir operators are given by $C_1 = -\hat{\mathbf{I}}\cdot\hat{\mathbf{I}}/\hbar^2$ and $C_2 = -\hat{\mathbf{K}}\cdot\hat{\mathbf{K}}/\hbar^2$. Since, by Theorem 18.14, $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}} = \hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$ on $V^-$, all of the irreducible representations of $\mathrm{so}(3) \oplus \mathrm{so}(3)$ that arise inside $V^-$ will be of the form $V_k \otimes V_k$.


Theorem 18.16 Let $W^{(n)}$ denote the eigenspace for the Hamiltonian with eigenvalue $E_n$. Then $W^{(n)}$ is invariant and irreducible under the action of $\mathrm{so}(3) \oplus \mathrm{so}(3)$ in (18.23). More specifically, we have the isomorphism

$$W^{(n)} \cong V_k \otimes V_k,$$

as representations of $\mathrm{so}(3) \oplus \mathrm{so}(3)$, where $k = (n-1)/2$ and where $V_k$ is the irreducible representation of $\mathrm{so}(3)$ of dimension $2k+1 = n$.


Corollary 18.17 If $n$, $k$, and $W^{(n)}$ are as in Theorem 18.16, then for all $\psi \in W^{(n)}$, we have

$$\hat{\mathbf{I}}\cdot\hat{\mathbf{I}}\,\psi = \hat{\mathbf{K}}\cdot\hat{\mathbf{K}}\,\psi = \hbar^2 k(k+1)\psi.$$

Using (18.22), the eigenvalue $E_n$ of $\hat{H}$ on $W^{(n)}$ can be solved for as

$$E_n = -\frac{\mu Q^4}{8\hbar^2\left(k+\tfrac{1}{2}\right)^2} = -\frac{\mu Q^4}{2\hbar^2 n^2}.$$


The expression for $E_n$ in Corollary 18.17 is the same as in Theorem 18.3. The remarkable thing about the proof of Corollary 18.17 is that it is purely algebraic, relying only on the commutation relations among the operators $\hat{I}_j$ and $\hat{K}_j$, along with the relationship (18.22) between the Hamiltonian operator $\hat{H}$ and the $\hat{I}_j$'s and $\hat{K}_j$'s.

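Proof of Corollary 18.17. It is easily seen that the operators $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$, when restricted to an irreducible subspace for the action of $\mathrm{so}(3) \oplus \mathrm{so}(3)$, are equal to $-\hbar^2 C_1$ and $-\hbar^2 C_2$, where $C_1$ and $C_2$ are the Casimir operators appearing in Proposition 18.15. Thus, if $W^{(n)}$ is isomorphic to $V_k \otimes V_k$, with $k = (n-1)/2$, then $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$ will be equal to $\hbar^2 k(k+1)I$, as claimed. On the other hand, $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$ are related to the Hamiltonian $\hat{H}$ by (18.22), from which we can solve for $E_n$. ∎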
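The final algebraic step can be made explicit. Assuming (18.22) reads $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}} = \mu Q^4/(8|\hat{H}|) - \hbar^2/4$, setting the left-hand side to $\hbar^2 k(k+1)$ gives $\hbar^2(k+\tfrac12)^2 = \mu Q^4/(8|E|)$. The sketch below works in units where $\mu Q^4/\hbar^2 = 1$ (a convenience chosen here, not something fixed by the text) and confirms in exact arithmetic that the result agrees with $-1/(2n^2)$ when $n = 2k+1$. The function name `energy_from_k` is hypothetical.

```python
from fractions import Fraction


def energy_from_k(k):
    """Energy in units of mu Q^4 / hbar^2 obtained from I.I = hbar^2 k(k+1)."""
    # k(k+1) + 1/4 = (k + 1/2)^2 = 1 / (8 |E|)  =>  E = -1 / (8 (k + 1/2)^2)
    return -1 / (8 * (Fraction(k) + Fraction(1, 2)) ** 2)


for n in range(1, 8):
    k = Fraction(n - 1, 2)          # k = 0, 1/2, 1, 3/2, ...
    assert energy_from_k(k) == Fraction(-1, 2 * n * n)
```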

Proof of Theorem 18.16. Since each component of $\hat{\mathbf{A}}$ and $\hat{\mathbf{J}}$ commutes with $\hat{H}$, each component of $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$ will also commute with $\hat{H}$. Each eigenspace of $\hat{H}$ is therefore invariant under the action of $\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}$. Since the $\hat{I}$'s and $\hat{K}$'s are self-adjoint and $W^{(n)}$ is finite dimensional, $W^{(n)}$ will decompose as a direct sum of irreducible invariant subspaces. By Proposition 18.15, these irreducible subspaces will be of the form $V_k \otimes V_l$, where $V_k$ and $V_l$ are irreducible representations of $\mathrm{so}(3)$ of dimension $2k+1$ and $2l+1$, respectively. But now, the operators $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}}$ and $\hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$, when restricted to one of the irreducible subspaces of $W^{(n)}$, are equal to $-\hbar^2 C_1$ and $-\hbar^2 C_2$, where $C_1$ and $C_2$ are the Casimir operators appearing in Proposition 18.15. Since $\hat{\mathbf{I}}\cdot\hat{\mathbf{I}} = \hat{\mathbf{K}}\cdot\hat{\mathbf{K}}$ on all of $V^-$, the eigenvalues of $C_1$ and $C_2$ must be equal on each irreducible subspace of $W^{(n)}$. Thus, we must have $k = l$, meaning that only irreducible subspaces of the form $V_k \otimes V_k$ arise.

Now, under the isomorphism of some irreducible subspace of $W^{(n)}$ with $V_k \otimes V_k$, the operators $\hat{I}_j$ and $\hat{K}_j$ act as $i\hbar F_j \otimes I$ and $i\hbar I \otimes F_j$, respectively, where the $F_j$'s are the usual basis for $\mathrm{so}(3)$. Since $\hat{\mathbf{J}} = \hat{\mathbf{I}} + \hat{\mathbf{K}}$, each $\hat{J}_j$ acts as $i\hbar(F_j \otimes I + I \otimes F_j)$. This means that $V_k \otimes V_k$, under the action of the $\hat{J}$'s, can be thought of as a tensor product of two representations of $\mathrm{so}(3)$, viewed as another representation of $\mathrm{so}(3)$ as in Definition 16.48. Viewed this way, $V_k \otimes V_k$ decomposes as in Proposition 17.23 as

$$V_k \otimes V_k \cong V_0 \oplus V_1 \oplus \cdots \oplus V_{2k}. \tag{18.24}$$

On the other hand, we know from Theorem 18.3 that $W^{(n)}$ decomposes under the action of $\mathrm{so}(3)$ as

$$W^{(n)} \cong V_0 \oplus \cdots \oplus V_{n-1}. \tag{18.25}$$

Thus, the space of the form $V_k \otimes V_k$ must be all of $W^{(n)}$: if there were another term then the trivial representation $V_0$ would occur more than once in $W^{(n)}$. This being the case, matching the decompositions (18.24) and (18.25) requires that $2k = n-1$, as claimed in the theorem. ∎
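The matching of the two decompositions can be illustrated by a dimension count. The sketch below assumes that $V_k \otimes V_k$ decomposes under $\mathrm{so}(3)$ into one copy of each $V_l$ for $l = 0, 1, \ldots, 2k$, with $\dim V_l = 2l+1$; it takes $2k$ as an integer argument so that half-integer $k$ is handled exactly. The function name is hypothetical, and the count only confirms consistency of dimensions, not the isomorphism itself.

```python
def so3_dimensions_in_tensor_square(two_k):
    """Dimensions 2l + 1 of the summands V_0, ..., V_{2k} of V_k (x) V_k."""
    return [2 * l + 1 for l in range(two_k + 1)]


for n in range(1, 10):
    two_k = n - 1                                   # the condition 2k = n - 1
    dims = so3_dimensions_in_tensor_square(two_k)
    assert len(dims) == n                           # summands V_0, ..., V_{n-1}
    assert sum(dims) == (two_k + 1) ** 2 == n * n   # dim(V_k (x) V_k) = n^2
```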

The proof of Theorem 18.16 relies to some extent on the results of Sect. 18.3. Using only algebraic manipulations involving the Runge-Lenz vector, however, we could still argue that the eigenvalues of $\hat{H}$ must be of the form given in Corollary 18.17. We would not, however, know that for every positive integer $n$, the number $E_n$ is actually an eigenvalue for $\hat{H}$. We would also not know that each eigenspace $W^{(n)}$ is irreducible under the action of $\mathrm{so}(4)$; conceivably, based only on the algebra, $W^{(n)}$ could have, say, dimension $2n^2$ instead of $n^2$.


18.5 The Role of Spin 


The spin of the electron is 1/2. As discussed in Sect. 17.8, this means that the Hilbert space for an electron is $L^2(\mathbb{R}^3) \otimes V_{1/2}$, where $V_{1/2}$ is a 2-dimensional vector space that carries an irreducible projective unitary representation of $\mathrm{SO}(3)$. Up to now, we have neglected the spin in our calculations. The reason for this omission is simple: to first approximation, the spin plays no role in the calculation. Specifically, in the simplest model of a hydrogen atom with spin, the Hamiltonian is simply $\hat{H} \otimes I$, where $\hat{H}$ is the operator in (18.7), acting on $L^2(\mathbb{R}^3)$. For any $n > 0$, we can obtain a basis of eigenvectors for $\hat{H} \otimes I$ with eigenvalue $E_n$ by taking vectors of the form $\psi_{n,l,m} \otimes e_j$, where the $\psi_{n,l,m}$'s are as in (18.10) and where $\{e_1, e_2\}$ forms a basis for $V_{1/2}$.

Now, from the point of view of rotational symmetry, the basis $\psi_{n,l,m} \otimes e_j$ is not the most natural one. Rather, we should decompose the eigenspaces into irreducible invariant subspaces for the (projective) action of $\mathrm{SO}(3)$, where $\mathrm{SO}(3)$ acts on both $L^2(\mathbb{R}^3)$ and $V_{1/2}$. We have already decomposed the eigenspaces inside $L^2(\mathbb{R}^3)$ into irreducible invariant subspaces, namely the span of $\psi_{n,l,m}$ where $n$ and $l$ are fixed and $m$ varies. Thus, to obtain the irreducible invariant subspaces inside $L^2(\mathbb{R}^3) \otimes V_{1/2}$, we use the method of "addition of angular momentum" from Sect. 17.9. According to Proposition 17.22, $V_l \otimes V_{1/2}$ is irreducible if $l = 0$ and isomorphic to $V_{l+1/2} \oplus V_{l-1/2}$ if $l > 0$. Consider, for example, the case $n = 3$, $l = 1$, the so-called "3p states" in traditional chemistry terminology. Since $V_1 \otimes V_{1/2}$ decomposes as $V_{3/2} \oplus V_{1/2}$, when we take spin into account, we obtain a 4-dimensional space and a 2-dimensional space. We can obtain bases for these spaces by tracing through the proof of Proposition 17.22.
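The bookkeeping in this decomposition can be sketched in a few lines. The code below assumes the rule just quoted ($V_l \otimes V_{1/2}$ is irreducible for $l = 0$ and splits as $V_{l+1/2} \oplus V_{l-1/2}$ for $l > 0$) and that $V_j$ has dimension $2j+1$. It lists, for a given $n$, the resulting triples $(l, j, \dim V_j)$ and checks that the dimensions add up to $2n^2$, the dimension of the eigenspace of $\hat{H} \otimes I$. The function name `spin_multiplets` is a hypothetical label.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_multiplets(n):
    """List (l, j, dimension) for the pieces of V_l (x) V_{1/2}, l = 0, ..., n-1."""
    pieces = []
    for l in range(n):
        js = [l + HALF] if l == 0 else [l + HALF, l - HALF]
        for j in js:
            pieces.append((l, j, int(2 * j + 1)))
    return pieces


# The "3p" case n = 3, l = 1: a 4-dimensional and a 2-dimensional space.
three_p = [(j, dim) for (l, j, dim) in spin_multiplets(3) if l == 1]
assert three_p == [(Fraction(3, 2), 4), (Fraction(1, 2), 2)]

for n in range(1, 7):
    assert sum(dim for (_, _, dim) in spin_multiplets(n)) == 2 * n * n
```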

The decomposition described in the previous paragraph is essential when considering the "fine structure" of hydrogen. Our model of hydrogen using the Hamiltonian (18.7) is only a first approximation. More realistic models take into account various corrections, including radiative corrections, a finite size for the nucleus, and "spin-orbit coupling," among other things. The notion of spin-orbit coupling adds a term into the Hamiltonian involving the operator $\hat{\mathbf{J}}\cdot\boldsymbol{\sigma}$, where $\sigma_1$, $\sigma_2$, and $\sigma_3$ are the operators describing the action of $\mathrm{so}(3)$ on $V_{1/2}$. When this term is included, the Hamiltonian is no longer of the form $A \otimes I$ for some operator $A$ on $L^2(\mathbb{R}^3)$. Thus, we can no longer simply append the spin to the end of the computation, but must take it into account from the beginning.

The various corrections to the Hamiltonian for the hydrogen atom have the effect of reducing the multiplicities of the eigenvalues. Almost any correction we make, for example, will destroy the independence of the eigenvalue on $l$ for a given $n$, simply because the correction terms in the Hamiltonian will not commute with the quantum Runge-Lenz vector. Nevertheless, all of the corrections that make up the fine structure of hydrogen preserve the rotational symmetry of the problem. Thus, the same irreducible representations of $\mathrm{SO}(3)$ that we had in the simple model will appear after the corrections are made. For $n = 2$, $l = 1$, for example, we will still have a 4-dimensional space and a 2-dimensional space, but these two spaces will no longer have the same energy.
In this section, we fill in many of the computations that we passed over without proof in Sect. 18.4. Although all the calculations are, in principle, elementary, there are a number of nonobvious tricks that help simplify the algebra. We will make frequent use of the concepts of functions that transform like vectors (on the classical side) and of vector operators (on the quantum side), including Propositions 17.25 and 17.27 (Sect. 17.10). In particular, we note that the position $\mathbf{x}$, the momentum $\mathbf{p}$, the angular momentum $\mathbf{J}$, and the Runge-Lenz vector $\mathbf{A}$ all transform like vectors, and that the corresponding quantum quantities are all vector operators. (Compare Exercise 7.) In the "$\varepsilon$" notation of Sect. 18.4.1, Proposition 17.27 takes the form

$$\frac{1}{i\hbar}[\hat{C}_j, \hat{J}_k] = \frac{1}{i\hbar}[\hat{J}_j, \hat{C}_k] = \varepsilon_{jkl}\hat{C}_l. \tag{18.26}$$

In the quantum mechanical calculations, there are a number of "quantum corrections," in which dot products and cross products of vector operators do not behave as they do in the classical case.


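Lemma 18.18 The $\varepsilon$-function in Definition 18.6 satisfies the relations

$$\varepsilon_{jkl}\varepsilon_{jmn} = \delta_{km}\delta_{ln} - \delta_{kn}\delta_{lm}$$
$$\varepsilon_{jkl}\varepsilon_{jkm} = 2\delta_{lm}.$$

The proof of these results is not difficult and is left to the reader (Exercise 6). The following identities involving the cross product of vector operators will be useful to us.


Lemma 18.19 If $\hat{\mathbf{C}}$, $\hat{\mathbf{D}}$, and $\hat{\mathbf{E}}$ are arbitrary vector operators, we have

$$\hat{\mathbf{C}}\cdot(\hat{\mathbf{D}}\times\hat{\mathbf{E}}) = (\hat{\mathbf{C}}\times\hat{\mathbf{D}})\cdot\hat{\mathbf{E}} \tag{18.27}$$
$$(\hat{\mathbf{C}}\times\hat{\mathbf{D}} + \hat{\mathbf{D}}\times\hat{\mathbf{C}})_j = \varepsilon_{jkl}[\hat{C}_k, \hat{D}_l] \tag{18.28}$$
$$(\hat{\mathbf{C}}\times\hat{\mathbf{C}})_j = \frac{1}{2}\varepsilon_{jkl}[\hat{C}_k, \hat{C}_l]. \tag{18.29}$$

In particular, if the different components of $\hat{\mathbf{C}}$ commute, then $\hat{\mathbf{C}}\times\hat{\mathbf{C}} = 0$. Finally,

$$(\hat{\mathbf{C}}\times(\hat{\mathbf{D}}\times\hat{\mathbf{E}}))_j = \hat{C}_k\hat{D}_j\hat{E}_k - \hat{C}_k\hat{D}_k\hat{E}_j. \tag{18.30}$$

As special cases of these results, we have

$$\hat{\mathbf{J}}\times\hat{\mathbf{P}} + \hat{\mathbf{P}}\times\hat{\mathbf{J}} = 2i\hbar\hat{\mathbf{P}} \tag{18.31}$$
$$\hat{\mathbf{J}}\times\hat{\mathbf{J}} = i\hbar\hat{\mathbf{J}}. \tag{18.32}$$

Note that if the entries of $\hat{\mathbf{D}}$ and $\hat{\mathbf{E}}$ commute, then the right-hand side of (18.30) reduces to the classical expression, $(\hat{\mathbf{C}}\cdot\hat{\mathbf{E}})\hat{\mathbf{D}} - (\hat{\mathbf{C}}\cdot\hat{\mathbf{D}})\hat{\mathbf{E}}$. Using (18.31), we can easily verify the alternative expression (18.19) for the Runge-Lenz vector.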
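Proof. The right-hand side of (18.27) is computed as $\varepsilon_{jkl}\hat{C}_k\hat{D}_l\hat{E}_j$. If we note that $\varepsilon_{jkl} = \varepsilon_{klj}$ and then relabel the indices, we obtain $\varepsilon_{jkl}\hat{C}_j\hat{D}_k\hat{E}_l$, which is equal to the left-hand side of (18.27). For (18.28), we compute that

$$(\hat{\mathbf{C}}\times\hat{\mathbf{D}} + \hat{\mathbf{D}}\times\hat{\mathbf{C}})_j = \varepsilon_{jkl}\hat{C}_k\hat{D}_l + \varepsilon_{jkl}\hat{D}_k\hat{C}_l$$
$$= \varepsilon_{jkl}\hat{C}_k\hat{D}_l + \varepsilon_{jkl}\hat{C}_l\hat{D}_k - \varepsilon_{jkl}[\hat{C}_l, \hat{D}_k]. \tag{18.33}$$

If we note that $\varepsilon_{jkl} = -\varepsilon_{jlk}$ and then relabel the indices $k$ and $l$, we see that $\varepsilon_{jkl}\hat{C}_l\hat{D}_k = -\varepsilon_{jkl}\hat{C}_k\hat{D}_l$, so that the first two terms in the second line of (18.33) cancel. The remaining term can be put into the claimed form by relabeling the indices $k$ and $l$. The identity (18.29) is just the $\hat{\mathbf{D}} = \hat{\mathbf{C}}$ case of (18.28). Finally, (18.30) follows easily from Lemma 18.18.

To obtain (18.31) and (18.32), we apply (18.28) and (18.29), respectively. Since both $\hat{\mathbf{J}}$ and $\hat{\mathbf{P}}$ are vector operators, the desired result follows easily from Lemma 18.18. ∎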
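Both the contraction identities for the antisymmetric symbol and the special case $\hat{\mathbf{J}}\times\hat{\mathbf{J}} = i\hbar\hat{\mathbf{J}}$ lend themselves to a brute-force check. The sketch below verifies $\varepsilon_{jkl}\varepsilon_{jmn} = \delta_{km}\delta_{ln} - \delta_{kn}\delta_{lm}$ and $\varepsilon_{jkl}\varepsilon_{jkm} = 2\delta_{lm}$ over all index values, and then tests $\varepsilon_{jkl}J_kJ_l = i\hbar J_j$ in one finite-dimensional example. That example is an assumption made for illustration: it takes $\hbar = 1$ and $J_j = iF_j$ with $(F_j)_{ab} = -\varepsilon_{jab}$, which is one matrix realization of the angular momentum commutation relations, not the operators on $L^2(\mathbb{R}^3)$.

```python
def epsilon(j, k, l):
    """Totally antisymmetric symbol for indices in {1, 2, 3}."""
    return (j - k) * (k - l) * (l - j) // 2


def delta(a, b):
    return 1 if a == b else 0


R = (1, 2, 3)


def check_epsilon_identities():
    first = all(
        sum(epsilon(j, k, l) * epsilon(j, m, n) for j in R)
        == delta(k, m) * delta(l, n) - delta(k, n) * delta(l, m)
        for k in R for l in R for m in R for n in R)
    second = all(
        sum(epsilon(j, k, l) * epsilon(j, k, m) for j in R for k in R)
        == 2 * delta(l, m)
        for l in R for m in R)
    return first and second


# Assumed matrix realization with hbar = 1: J_j = i F_j, (F_j)_{ab} = -epsilon(j, a, b).
J = {j: [[1j * -epsilon(j, a, b) for b in R] for a in R] for j in R}


def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]


def check_j_cross_j():
    """True if epsilon_{jkl} J_k J_l equals i J_j for j = 1, 2, 3."""
    products = {(k, l): matmul(J[k], J[l]) for k in R for l in R}
    for j in R:
        for a in range(3):
            for b in range(3):
                lhs = sum(epsilon(j, k, l) * products[(k, l)][a][b]
                          for k in R for l in R)
                if lhs != 1j * J[j][a][b]:
                    return False
    return True


assert check_epsilon_identities()
assert check_j_cross_j()
```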

We now turn to the proofs of the results of Sect. 18.4. We prove only the 
quantum versions of the results, since the classical results are extremely 
similar, except that certain quantum corrections can be ignored. 

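Proof of Lemma 18.12, First Part. We begin by showing that $\hat{A}_j$ commutes with $\hat{H}$ for each $j$. Since $\hat{H}$ commutes with $\hat{\mathbf{J}}$, we have

$$[\hat{A}_j, \hat{H}] = \frac{1}{2\mu Q^2}\,\varepsilon_{jkl}\left([\hat{P}_k, \hat{H}]\hat{J}_l - \hat{J}_k[\hat{P}_l, \hat{H}]\right) - \left[\frac{\hat{X}_j}{|\hat{\mathbf{X}}|}, \hat{H}\right].$$

Meanwhile, since the $\hat{P}$'s commute among themselves, we have

$$[\hat{P}_k, \hat{H}] = -Q^2\left[\hat{P}_k, \frac{1}{|\hat{\mathbf{X}}|}\right] = -i\hbar Q^2\frac{\hat{X}_k}{|\hat{\mathbf{X}}|^3}.$$

Thus,

$$\varepsilon_{jkl}[\hat{P}_k, \hat{H}]\hat{J}_l = -i\hbar Q^2\,\varepsilon_{jkl}\varepsilon_{lmn}\frac{\hat{X}_k}{|\hat{\mathbf{X}}|^3}\hat{X}_m\hat{P}_n$$
$$= -i\hbar Q^2\left(\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}\right)\frac{\hat{X}_k}{|\hat{\mathbf{X}}|^3}\hat{X}_m\hat{P}_n$$
$$= -i\hbar Q^2\frac{1}{|\hat{\mathbf{X}}|^3}\left(\hat{X}_k\hat{X}_j\hat{P}_k - \hat{X}_m\hat{X}_m\hat{P}_j\right)$$
$$= -i\hbar Q^2\frac{1}{|\hat{\mathbf{X}}|^3}\left(\hat{X}_j(\hat{\mathbf{X}}\cdot\hat{\mathbf{P}}) - (\hat{\mathbf{X}}\cdot\hat{\mathbf{X}})\hat{P}_j\right). \tag{18.34}$$

We compute $\varepsilon_{jkl}\hat{J}_k[\hat{P}_l, \hat{H}]$ in a similar way. Note that $\hat{J}_k = \varepsilon_{kmn}\hat{X}_m\hat{P}_n = \varepsilon_{kmn}\hat{P}_n\hat{X}_m$, since $\hat{X}_m$ and $\hat{P}_n$ commute except when $m = n$, in which case $\varepsilon_{kmn} = 0$. The result is

$$\varepsilon_{jkl}\hat{J}_k[\hat{P}_l, \hat{H}] = -i\hbar Q^2\left(\hat{P}_j(\hat{\mathbf{X}}\cdot\hat{\mathbf{X}}) - (\hat{\mathbf{P}}\cdot\hat{\mathbf{X}})\hat{X}_j\right)\frac{1}{|\hat{\mathbf{X}}|^3}.$$

Meanwhile, since the $\hat{X}$'s commute among themselves, we have

$$\left[\frac{\hat{X}_j}{|\hat{\mathbf{X}}|}, \hat{H}\right] = \left[\frac{\hat{X}_j}{|\hat{\mathbf{X}}|}, \frac{\hat{P}_k\hat{P}_k}{2\mu}\right]$$
$$= \frac{1}{2\mu}\left(\hat{P}_k\left[\frac{\hat{X}_j}{|\hat{\mathbf{X}}|}, \hat{P}_k\right] + \left[\frac{\hat{X}_j}{|\hat{\mathbf{X}}|}, \hat{P}_k\right]\hat{P}_k\right)$$
$$= \frac{i\hbar}{2\mu}\hat{P}_k\left(\frac{\delta_{jk}}{|\hat{\mathbf{X}}|} - \frac{\hat{X}_j\hat{X}_k}{|\hat{\mathbf{X}}|^3}\right) + \frac{i\hbar}{2\mu}\left(\frac{\delta_{jk}}{|\hat{\mathbf{X}}|} - \frac{\hat{X}_j\hat{X}_k}{|\hat{\mathbf{X}}|^3}\right)\hat{P}_k$$
$$= \frac{i\hbar}{2\mu}\left(\hat{P}_j\frac{1}{|\hat{\mathbf{X}}|} - (\hat{\mathbf{P}}\cdot\hat{\mathbf{X}})\frac{\hat{X}_j}{|\hat{\mathbf{X}}|^3}\right) + \frac{i\hbar}{2\mu}\left(\frac{1}{|\hat{\mathbf{X}}|}\hat{P}_j - \frac{\hat{X}_j}{|\hat{\mathbf{X}}|^3}(\hat{\mathbf{X}}\cdot\hat{\mathbf{P}})\right). \tag{18.35}$$

It is now a simple matter to compute $[\hat{A}_j, \hat{H}]$ by combining (18.34) and (18.35) and verify that everything cancels. We have, for example, a term involving $(\hat{X}_j/|\hat{\mathbf{X}}|^3)(\hat{\mathbf{X}}\cdot\hat{\mathbf{P}})$ in (18.34) and a canceling term in (18.35). ∎

Before proceeding with the remaining results concerning the Runge-Lenz vector, we verify some results that will be needed later. There are some quantum corrections compared to the corresponding classical results.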











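Lemma 18.20 As in the classical case, the following "orthogonality" relations among vector operators hold:

$$\hat{\mathbf{J}}\cdot\hat{\mathbf{P}} = \hat{\mathbf{P}}\cdot\hat{\mathbf{J}} = 0 \tag{18.36}$$
$$\hat{\mathbf{J}}\cdot\hat{\mathbf{X}} = \hat{\mathbf{X}}\cdot\hat{\mathbf{J}} = 0 \tag{18.37}$$
$$(\hat{\mathbf{P}}\times\hat{\mathbf{J}})\cdot\hat{\mathbf{J}} = \hat{\mathbf{J}}\cdot(\hat{\mathbf{P}}\times\hat{\mathbf{J}}) = 0. \tag{18.38}$$

Meanwhile, there is a quantum correction in the dot product between $\hat{\mathbf{P}}$ and $\hat{\mathbf{P}}\times\hat{\mathbf{J}}$, as follows:

$$\hat{\mathbf{P}}\cdot(\hat{\mathbf{P}}\times\hat{\mathbf{J}}) = 0 \tag{18.39}$$
$$(\hat{\mathbf{P}}\times\hat{\mathbf{J}})\cdot\hat{\mathbf{P}} = 2i\hbar(\hat{\mathbf{P}}\cdot\hat{\mathbf{P}}). \tag{18.40}$$

Finally, we have

$$(\hat{\mathbf{P}}\times\hat{\mathbf{J}})\cdot(\hat{\mathbf{P}}\times\hat{\mathbf{J}}) = (\hat{\mathbf{P}}\cdot\hat{\mathbf{P}})(\hat{\mathbf{J}}\cdot\hat{\mathbf{J}}) \tag{18.41}$$
$$\hat{\mathbf{X}}\cdot(\hat{\mathbf{P}}\times\hat{\mathbf{J}}) = \hat{\mathbf{J}}\cdot\hat{\mathbf{J}} \tag{18.42}$$
$$(\hat{\mathbf{P}}\times\hat{\mathbf{J}})\cdot\hat{\mathbf{X}} = \hat{\mathbf{J}}\cdot\hat{\mathbf{J}} + 2i\hbar\,\hat{\mathbf{P}}\cdot\hat{\mathbf{X}}. \tag{18.43}$$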
Proof. By (18.27) and (18.29), we have 


j-P=(XxP)-P=X-(PxP)=0, 


since the ai components of P commute. The same reasoning shows 
that P- J, J-X, and X-J are all zero. To compute (P x J) -J, we first 
use ee then use (18.32), and then use that P- J = 0. For J-(P x J), 
we rewrite P x J in terms of J x P, using (18.31). The correction term 
involves P, which has a dot product of zero with J , and so the answer is 
again zero. 

We use (18.27) and (18.29) again to establish (18.39). To get (18.40), we
first rewrite $\mathbf{P}\times\mathbf{J}$ in terms of $\mathbf{J}\times\mathbf{P}$ using (18.31) and then
apply (18.39). To establish (18.41), we apply (18.27) and then (18.30), giving

\[
(\mathbf{P}\times\mathbf{J})\cdot(\mathbf{P}\times\mathbf{J}) = P_jJ_kP_jJ_k - P_jJ_kP_kJ_j. \qquad (18.44)
\]

The second term on the right-hand side of (18.44) is zero because
$\mathbf{J}\cdot\mathbf{P} = 0$. For the first term, we move $J_k$ to the right past $P_j$.
This generates the term we want plus a correction term equal to
$i\hbar\,\varepsilon_{kjl}P_jP_lJ_k$. The correction term is zero because $P_j$ and
$P_l$ commute and $\varepsilon_{kjl}$ changes sign under interchange of $j$ and $l$.
The identity (18.42) follows immediately from (18.27) and the definition
of $\mathbf{J}$. The identity (18.43) follows from (18.27) and (18.28). ∎


Lemma 18.21 For all $j$ and $m$, we have

\[
[(\mathbf{P}\times\mathbf{J})_j, (\mathbf{P}\times\mathbf{J})_m] = -i\hbar(\mathbf{P}\cdot\mathbf{P})\varepsilon_{jml}J_l.
\]


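Proof. In computing $[P_kJ_l, P_nJ_o]$, we use repeatedly the product rule
for commutators (Point 3 of Proposition 3.15). We obtain four terms, one of
which is zero (the term involving $[P_k, P_n]$). We use Proposition 17.27 (in
the form (18.26)) to evaluate all remaining terms, giving

\[
\frac{1}{i\hbar}[\varepsilon_{jkl}P_kJ_l, \varepsilon_{mno}P_nJ_o]
= \frac{1}{i\hbar}\varepsilon_{jkl}\varepsilon_{mno}\bigl(P_k[J_l, P_n]J_o + P_kP_n[J_l, J_o] + P_n[P_k, J_o]J_l\bigr). \qquad (18.45)
\]

Let us compute the first of the three terms on the right-hand side of
(18.45). Using Lemma 18.18 and the fact that $\mathbf{P}$ is a vector operator,
we get

\[
\begin{aligned}
\frac{1}{i\hbar}\varepsilon_{jkl}\varepsilon_{mno}P_k[J_l, P_n]J_o
&= \varepsilon_{jkl}(\delta_{op}\delta_{ml} - \delta_{ol}\delta_{mp})P_kP_pJ_o \\
&= \varepsilon_{jkm}P_kP_oJ_o - \varepsilon_{jko}P_kP_mJ_o \\
&= \varepsilon_{jkm}P_k(\mathbf{P}\cdot\mathbf{J}) - P_m(\mathbf{P}\times\mathbf{J})_j.
\end{aligned}
\]

If we compute the second and third terms similarly, we obtain

\[
\begin{aligned}
\frac{1}{i\hbar}[\varepsilon_{jkl}P_kJ_l, \varepsilon_{mno}P_nJ_o]
&= \varepsilon_{jkm}P_k(\mathbf{P}\cdot\mathbf{J}) - P_m(\mathbf{P}\times\mathbf{J})_j \\
&\quad + (\mathbf{P}\times\mathbf{P})_jJ_m - \varepsilon_{jkm}P_k(\mathbf{P}\cdot\mathbf{J}) + P_m(\mathbf{P}\times\mathbf{J})_j - (\mathbf{P}\cdot\mathbf{P})\varepsilon_{jml}J_l.
\end{aligned}
\]

Three of the above terms are zero (those involving $\mathbf{P}\cdot\mathbf{J}$ or
$\mathbf{P}\times\mathbf{P}$) and two other terms cancel, leaving us with

\[
\frac{1}{i\hbar}[\varepsilon_{jkl}P_kJ_l, \varepsilon_{mno}P_nJ_o] = -(\mathbf{P}\cdot\mathbf{P})\varepsilon_{jml}J_l,
\]

as claimed. ∎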
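
The index manipulations in the last two proofs rest on the contraction
identity for the totally antisymmetric symbol. As a sanity check, the
identity can be verified by brute force over all index values. The
following is only a minimal sketch: the function names are illustrative,
indices run over 0, 1, 2 rather than 1, 2, 3, and we are assuming that
Lemma 18.18 is the identity
$\varepsilon_{jkl}\varepsilon_{jmn} = \delta_{km}\delta_{ln} - \delta_{kn}\delta_{lm}$
suggested by the hint to Exercise 6.

```python
from itertools import product


def eps(j, k, l):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    # The product is +2 or -2 for permutations of (0, 1, 2) and 0 otherwise.
    return (j - k) * (k - l) * (l - j) // 2


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def contraction_identity_holds():
    """Check sum_j eps_{jkl} eps_{jmn} = d_{km} d_{ln} - d_{kn} d_{lm}."""
    for k, l, m, n in product(range(3), repeat=4):
        lhs = sum(eps(j, k, l) * eps(j, m, n) for j in range(3))
        rhs = delta(k, m) * delta(l, n) - delta(k, n) * delta(l, m)
        if lhs != rhs:
            return False
    return True
```

Calling `contraction_identity_holds()` runs through all 81 index
combinations. This checks only the numerical identity, not the operator
ordering that the proofs above also have to track.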

We now continue with the proof of the properties of the Runge—Lenz 
vector. 
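Proof of Proposition 18.11. From the first set of orthogonality relations
in Lemma 18.20, we can see easily that $\mathbf{J}\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{J} = 0$. Meanwhile,
using the expression (18.19) for $\mathbf{A}$ and expanding out $\mathbf{A}\cdot\mathbf{A}$ yields,
after a little simplification,

\[
\mathbf{A}\cdot\mathbf{A} = I + \frac{1}{\mu^2Q^4}(\mathbf{P}\cdot\mathbf{P})(\mathbf{J}\cdot\mathbf{J} + \hbar^2)
- \frac{1}{\mu Q^2}\left(2\,\mathbf{J}\cdot\mathbf{J}\frac{1}{|\mathbf{X}|}
+ i\hbar\left(\mathbf{P}\cdot\frac{\mathbf{X}}{|\mathbf{X}|} - \frac{\mathbf{X}}{|\mathbf{X}|}\cdot\mathbf{P}\right)\right).
\]

Now

\[
\frac{\mathbf{X}}{|\mathbf{X}|}\cdot\mathbf{P} - \mathbf{P}\cdot\frac{\mathbf{X}}{|\mathbf{X}|}
= i\hbar\left(\frac{3}{|\mathbf{X}|} - \frac{\mathbf{X}\cdot\mathbf{X}}{|\mathbf{X}|^3}\right)
= 2i\hbar\frac{1}{|\mathbf{X}|}.
\]

Thus,

\[
\mathbf{A}\cdot\mathbf{A} = I + \frac{2}{\mu Q^4}(\mathbf{J}\cdot\mathbf{J} + \hbar^2)\left(\frac{\mathbf{P}\cdot\mathbf{P}}{2\mu} - \frac{Q^2}{|\mathbf{X}|}\right),
\]

as claimed. ∎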
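
It may help to compare this with the classical situation, where the
$\hbar^2$ correction is absent. The sketch below evaluates the classical
Runge-Lenz vector at one arbitrary phase-space point and checks the two
classical relations $\mathbf{A}\cdot\mathbf{J} = 0$ and
$|\mathbf{A}|^2 = 1 + 2E|\mathbf{J}|^2/(\mu Q^4)$. It assumes the classical
formula $\mathbf{A} = (\mathbf{p}\times\mathbf{J})/(\mu Q^2) - \mathbf{x}/|\mathbf{x}|$
(the formal $\hbar \to 0$ form of (18.19)); the names and the sample point
are illustrative only.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(s * t for s, t in zip(a, b))


def classical_runge_lenz(x, p, mu, q2):
    """Return (A, J, E) for H = |p|^2/(2 mu) - q2/|x|, where q2 stands for Q^2."""
    r = math.sqrt(dot(x, x))
    J = cross(x, p)                      # angular momentum
    p_cross_J = cross(p, J)
    A = tuple(p_cross_J[i] / (mu * q2) - x[i] / r for i in range(3))
    E = dot(p, p) / (2 * mu) - q2 / r    # energy
    return A, J, E


# An arbitrary sample point (assumed values, chosen only for illustration).
x, p, mu, q2 = (1.0, 2.0, -0.5), (0.3, -0.7, 1.1), 1.5, 2.0
A, J, E = classical_runge_lenz(x, p, mu, q2)

orthogonality_defect = dot(A, J)
norm_defect = dot(A, A) - (1 + 2 * E * dot(J, J) / (mu * q2 ** 2))
```

Both `orthogonality_defect` and `norm_defect` should vanish up to rounding
error. Replacing $|\mathbf{J}|^2$ by $\mathbf{J}\cdot\mathbf{J} + \hbar^2$ in the
second relation gives the quantum formula just proved.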

Proof of Lemma 18.12, Second Part. We write $\mathbf{A}$ in the form given
in (18.19). In computing the commutator of $A_j$ with $A_m$, we get several
different types of terms, which we compute one at a time. Of course, the
commutator of $X_j/|\mathbf{X}|$ with $X_m/|\mathbf{X}|$ is zero. The commutator of the
$\mathbf{P}\times\mathbf{J}$ terms has been computed in Lemma 18.21.




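Meanwhile, to compute the commutator of $P_kJ_l$ with $X_m(1/|\mathbf{X}|)$, we
again get four terms and, again, one of these is zero, namely the one
involving $[J_l, 1/|\mathbf{X}|]$, since $1/|\mathbf{X}|$ is invariant under rotations.
We have, then,

\[
\begin{aligned}
\frac{1}{i\hbar}\left[\varepsilon_{jkl}P_kJ_l, X_m\frac{1}{|\mathbf{X}|}\right]
&= \frac{1}{i\hbar}\left(\varepsilon_{jkl}[P_k, X_m]J_l\frac{1}{|\mathbf{X}|}
 + \varepsilon_{jkl}P_k[J_l, X_m]\frac{1}{|\mathbf{X}|}
 + \varepsilon_{jkl}X_m\left[P_k, \frac{1}{|\mathbf{X}|}\right]J_l\right) \\
&= -\varepsilon_{jkl}\delta_{km}J_l\frac{1}{|\mathbf{X}|}
 + \varepsilon_{jkl}\varepsilon_{lmn}P_kX_n\frac{1}{|\mathbf{X}|}
 + \varepsilon_{jkl}X_m\frac{X_k}{|\mathbf{X}|^3}\varepsilon_{lno}X_nP_o.
\end{aligned}
\]

If we apply Lemma 18.18 and carry out some computations similar to ones
we have already performed, we obtain

\[
\begin{aligned}
\frac{1}{i\hbar}\left[\varepsilon_{jkl}P_kJ_l, X_m\frac{1}{|\mathbf{X}|}\right]
&= -\varepsilon_{jml}J_l\frac{1}{|\mathbf{X}|} + \delta_{jm}(\mathbf{P}\cdot\mathbf{X})\frac{1}{|\mathbf{X}|} \\
&\quad + X_mX_j\frac{1}{|\mathbf{X}|^3}(\mathbf{X}\cdot\mathbf{P})
 - \left(P_m\frac{X_j}{|\mathbf{X}|} + \frac{X_m}{|\mathbf{X}|}P_j\right). \qquad (18.46)
\end{aligned}
\]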


In a commutator of the form $[\alpha_j + \beta_j, \alpha_m + \beta_m]$, the terms
involving the commutator of an $\alpha$ with a $\beta$ will be
$[\alpha_j, \beta_m] + [\beta_j, \alpha_m]$, which is equal to
$[\alpha_j, \beta_m] - [\alpha_m, \beta_j]$. This quantity is skew-symmetric in $j$
and $m$, meaning that it changes sign when we interchange $j$ with $m$.
Thus, terms in (18.46) that are symmetric in $j$ and $m$ will disappear
when we compute the full commutator of $A_j$ with $A_m$. Thus, the second
and third terms in (18.46) can be ignored. In the last term, we can
commute $P_m$ past $X_j$ to obtain

\[
P_m\frac{X_j}{|\mathbf{X}|} + \frac{X_m}{|\mathbf{X}|}P_j
= \frac{X_j}{|\mathbf{X}|}P_m + \frac{X_m}{|\mathbf{X}|}P_j
- i\hbar\left(\frac{\delta_{jm}}{|\mathbf{X}|} - \frac{X_jX_m}{|\mathbf{X}|^3}\right),
\]

which is also symmetric. Thus, only the first term in (18.46) contributes
to the computation of $[A_j, A_m]$. This term is skew-symmetric in $j$ and
$m$ and will be doubled when we compute $[A_j, A_m]$.

Now, it is straightforward to compute $[\varepsilon_{jkl}P_kJ_l, P_m]$ and
$[P_j, X_m/|\mathbf{X}|]$ and to verify that these commutators are symmetric in
$j$ and $m$ (Exercise 8) and therefore do not contribute to the
computation of $[A_j, A_m]$. We are left, then, with the following:

\[
\begin{aligned}
\frac{1}{i\hbar}[A_j, A_m]
&= -\frac{1}{\mu^2Q^4}\varepsilon_{jml}(\mathbf{P}\cdot\mathbf{P})J_l + \frac{2}{\mu Q^2}\varepsilon_{jml}J_l\frac{1}{|\mathbf{X}|} \\
&= -\frac{2}{\mu Q^4}\varepsilon_{jml}\left(\frac{\mathbf{P}\cdot\mathbf{P}}{2\mu} - \frac{Q^2}{|\mathbf{X}|}\right)J_l,
\end{aligned}
\]

which is what is claimed in the lemma.

Proof of Theorem 18.14. Since the Hamiltonian $H$ is invariant under
rotations, $H$ commutes with each component of the angular momentum.
We have also established that $H$ commutes with each component of the
Runge-Lenz vector. From this it follows easily that $\mathbf{I}$ and $\mathbf{K}$
commute with the Hamiltonian.

Since $A_j$ commutes with $H$, it also commutes with any function of $H$.

It then follows from Lemma 18.12 that

\[
\frac{1}{i\hbar}[B_j, B_k] = \frac{\mu Q^4}{2|H|}\frac{1}{i\hbar}[A_j, A_k] = -\frac{H}{|H|}\varepsilon_{jkl}J_l.
\]

Since $H/|H| = -I$ on the negative-energy subspace $V^-$, the above
expression reduces to $\varepsilon_{jkl}J_l$. (The result on the positive-energy
subspace will differ by a crucial minus sign from what we have on $V^-$.)

Meanwhile, since both $\mathbf{B}$ and $\mathbf{J}$ are vector operators, we have, by
Proposition 17.27, $(1/(i\hbar))[B_j, J_k] = \varepsilon_{jkl}B_l$ and
$(1/(i\hbar))[J_j, J_k] = \varepsilon_{jkl}J_l$. From the commutation relations among
the $B_j$'s and $J_j$'s, it is an easy calculation to verify the claimed
commutation relations among the components of $\mathbf{I}$ and $\mathbf{K}$. ∎


18.7 Exercises 


1. Consider the quantum Hamiltonian for two particles in $\mathbb{R}^3$
interacting by means of a $1/r$ potential:

\[
H = -\frac{\hbar^2}{2m_1}\Delta_1 - \frac{\hbar^2}{2m_2}\Delta_2 - \frac{Q^2}{|x^1 - x^2|}.
\]

Here, as in Sect. 3.11, $\Delta_1$ is the Laplacian with respect to the
variable $x^1$ and $\Delta_2$ is the Laplacian with respect to the variable
$x^2$. As in Sect. 2.3.3, introduce new variables consisting of the center
of mass, $c = (m_1x^1 + m_2x^2)/(m_1 + m_2)$, and the relative position,
$y = x^1 - x^2$. Show that $H$ can be expressed in these variables as

\[
H = -\frac{\hbar^2}{2(m_1 + m_2)}\Delta_c - \frac{\hbar^2}{2\mu}\Delta_y - \frac{Q^2}{|y|},
\]

where $\mu$ is the reduced mass, given by $\mu = m_1m_2/(m_1 + m_2)$.

Note: In the new variables, $H$ is the sum of two terms, one of which
involves only the variable $c$ and one of which involves only the
variable $y$. The term involving only $c$ is the Hamiltonian for a free
particle with mass $m_1 + m_2$, whereas the term involving only $y$ is the
Hamiltonian for a particle of mass $\mu$ moving in a $1/r$ potential.

2. Let $H(x, p) = |p|^2/(2\mu) - Q^2/|x|$ denote the Hamiltonian for the
classical Kepler problem in $\mathbb{R}^3$. Show that for every $\varepsilon > 0$, the
region in $\mathbb{R}^6$ given by $\{(x, p) \mid H(x, p) < -\varepsilon\}$ has finite volume.


3. Let $\mathbb{H}$ denote the real span of the following four elements of
$M_2(\mathbb{C})$:


(a) Show that $\mathbb{H}$ forms an associative algebra over $\mathbb{R}$, under the
operation of matrix multiplication, and that the following relations
are satisfied:

\[
\begin{aligned}
ij &= -ji = k \\
jk &= -kj = i \\
ki &= -ik = j.
\end{aligned}
\]

The algebra $\mathbb{H}$ is (one particular realization of) the quaternion
algebra.

(b) Show that each nonzero element of $\mathbb{H}$ has a multiplicative
inverse.

Hint: Imitate the argument that each nonzero complex number has
a multiplicative inverse.


4. Let $\mathbb{H}$ denote the quaternion algebra defined in Exercise 3. This
exercise establishes explicitly an isomorphism between the Lie algebras
$\mathrm{so}(4)$ and $\mathrm{so}(3) \oplus \mathrm{so}(3)$ (compare Definition 16.14).

(a) Let $V$ be the subspace of $\mathbb{H}$ spanned by $i$, $j$, and $k$. Show that
$V$ forms a Lie algebra under the bracket $[\alpha, \beta] = \alpha\beta - \beta\alpha$ and
that $V$ is isomorphic as a Lie algebra to $\mathrm{so}(3)$.

(b) Let $\mathrm{End}(\mathbb{H})$ denote the algebra of real-linear maps of $\mathbb{H}$ to
itself. Given $\alpha \in V$, let $L_\alpha \in \mathrm{End}(\mathbb{H})$ be the "left
multiplication by $\alpha$" map, $L_\alpha(\beta) = \alpha\beta$, and let
$R_\alpha \in \mathrm{End}(\mathbb{H})$ be the "right multiplication by $\alpha$" map,
$R_\alpha(\beta) = \beta\alpha$. Show that the maps $\alpha \mapsto L_\alpha$ and
$\alpha \mapsto -R_\alpha$ are Lie algebra homomorphisms of $V$ into $\mathrm{End}(\mathbb{H})$.

(c) Consider the inner product on $\mathbb{H}$ in which $\{1, i, j, k\}$ forms an
orthonormal basis. Given $\alpha \in V$, show that

\[
\begin{aligned}
\langle L_\alpha\beta, \gamma\rangle &= -\langle\beta, L_\alpha\gamma\rangle \\
\langle R_\alpha\beta, \gamma\rangle &= -\langle\beta, R_\alpha\gamma\rangle.
\end{aligned}
\]

That is to say, $L_\alpha$ and $R_\alpha$ belong to $\mathrm{so}(4)$, which we identify
with the space of elements of $\mathrm{End}(\mathbb{H})$ that are skew-symmetric
with respect to the inner product in Part (c).

(d) Show that the map $(\alpha, \beta) \mapsto L_\alpha - R_\beta$ is a Lie algebra
isomorphism of $\mathrm{so}(3) \oplus \mathrm{so}(3)$ to $\mathrm{so}(4)$.

(e) Let $D$ denote the diagonal subalgebra of $\mathrm{so}(3) \oplus \mathrm{so}(3)$, that
is, the set of elements of the form $(X, X)$. Show that the image of
$D$ under the isomorphism in Part (d) is the set of elements $Y$ of
$\mathrm{so}(4) \subset \mathrm{End}(\mathbb{H})$ having the following form with respect to the
basis in Part (c):

\[
Y = \begin{pmatrix} 0 & 0 \\ 0 & Z \end{pmatrix},
\]

where $Z \in \mathrm{so}(3)$.


5. Describe explicitly the two subalgebras of $\mathrm{so}(4)$ corresponding to
the two copies of $\mathrm{so}(3)$ in the isomorphism

\[
\mathrm{so}(4) \cong \mathrm{so}(3) \oplus \mathrm{so}(3)
\]

in Exercise 4.


6. Verify Lemma 18.18.

Hint: First show that $\varepsilon_{jkl}\varepsilon_{jmn} = 0$ unless $(k, l) = (m, n)$ or
$(k, l) = (n, m)$.


7. In this exercise, we use the summation convention of Sect. 18.4.1.

(a) Show that for any $3 \times 3$ matrix $M$ and any indices
$j, k, l \in \{1, 2, 3\}$, we have

\[
\varepsilon_{mno}M_{jm}M_{kn}M_{lo} = \varepsilon_{jkl}(\det M).
\]

(b) Show that if $\mathbf{C}$ is a vector operator, then for all $R \in \mathrm{SO}(3)$,
we have

\[
\Pi(R)C_k\Pi(R)^{-1} = R_{lk}C_l.
\]

(c) Show that the cross product of two vector operators is a vector
operator.

Hint: Write the definition of a vector operator in the equivalent
form

\[
\mathbf{v}\cdot\mathbf{C} = \Pi(R)\bigl((R^{-1}\mathbf{v})\cdot\mathbf{C}\bigr)\Pi(R)^{-1}.
\]


8. Compute $[\varepsilon_{jkl}P_kJ_l, P_m]$ and $[P_j, X_m/|\mathbf{X}|]$ and show that
both of these quantities are symmetric in $j$ and $m$, meaning that the
value is unchanged if we interchange $j$ and $m$.

9. Show that Eq. (18.14) has two power series solutions for $g(\rho)$, one
starting with $\rho^{-(2l+1)}$ and one starting with $\rho^0$.


19 Systems and Subsystems, Multiple Particles


19.1 Introduction 


Up to this point, we have considered the state of a quantum system to
be described by a unit vector in the corresponding Hilbert space, or more
properly, an equivalence class of unit vectors under the equivalence relation
$\psi \sim e^{i\theta}\psi$. We will see in this section that this notion of the state
of a quantum system is too limited. We will introduce a more general notion
of the state of a system, described by a density matrix. The special case
in which the system can be described by a unit vector will be called a
pure state.

One way to see the inadequacy of the notion of state as a unit vector is
to consider systems and subsystems. We will examine this topic in greater
detail in Sect. 19.5, but for now let us consider the example of a system of
two spinless "distinguishable" particles moving in $\mathbb{R}^3$. (For now, the
reader need not worry about the notion of distinguishable particles; just
think of them as being two different types of particles, with, say, different
masses or charges.) Let us assume the combined state of the two particles
can be described by a unit vector in the corresponding Hilbert space, which
is (according to Sect. 3.11) $L^2(\mathbb{R}^6)$. We have, then, a wave function
$\psi(x, y)$, where $x$ is the position of the first particle and $y$ is the
position of the second particle.

Given a wave function $\psi(x, y)$ for the combined system, what is the
wave function describing the state of the first particle only? If the wave
function of the combined system happens to be a product, say,
$\psi(x, y) = \psi_1(x)\psi_2(y)$, then, naturally, we would say that the state of
the first particle is simply $\psi_1$. Of course, one might object that we
could rewrite $\psi$ as $\psi(x, y) = [c\psi_1(x)][\psi_2(y)/c]$ for any nonzero
constant $c$, but this only affects the wave function for the first particle
by a constant, which does not affect the physical state.

In general, however, the wave function of the combined system need
not be a product. Already when $\psi$ is a linear combination of two
products, $\psi(x, y) = \psi_1(x)\psi_2(y) + \phi_1(x)\phi_2(y)$, it is unclear what the
correct wave function is for the first particle. At first glance, it might
seem natural to try $\psi_1(x) + \phi_1(x)$, but upon closer examination, this is
not an unambiguous proposal. After all, we can just as well write
$\psi(x, y) = [c_1\psi_1(x)][\psi_2(y)/c_1] + [c_2\phi_1(x)][\phi_2(y)/c_2]$, but then the
resulting wave functions for the first particle, $\psi_1(x) + \phi_1(x)$ and
$c_1\psi_1(x) + c_2\phi_1(x)$, are in general not scalar multiples of one another.
For a general unit vector $\psi$ in $L^2(\mathbb{R}^6)$, the situation is even worse.
The conclusion is this: There does not seem to be any way to associate to
a general unit vector $\psi$ in $L^2(\mathbb{R}^6)$ a unit vector $\psi'$ in $L^2(\mathbb{R}^3)$ such
that $\psi'$ could sensibly be described as "the state of the first particle."
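
A finite-dimensional caricature may make the ambiguity concrete. If each
particle has only three possible positions, a wave function $\psi(x, y)$
becomes a $3 \times 3$ table, and $\psi$ is a product exactly when the table
has the form $u_a v_b$, that is, when all of its $2 \times 2$ minors vanish.
The sketch below is only an illustration under that discretization
assumption; the names are ours and normalization is ignored.

```python
def outer(u, v):
    """Table of the product function u(x) v(y)."""
    return [[a * b for b in v] for a in u]


def add(m, n):
    """Entrywise sum of two tables of the same shape."""
    return [[s + t for s, t in zip(row_m, row_n)] for row_m, row_n in zip(m, n)]


def is_product(table, tol=1e-12):
    """True if table[a][b] = u[a] * v[b] for some u, v (all 2x2 minors vanish)."""
    rows, cols = len(table), len(table[0])
    for a in range(rows):
        for b in range(a + 1, rows):
            for c in range(cols):
                for d in range(c + 1, cols):
                    minor = table[a][c] * table[b][d] - table[a][d] * table[b][c]
                    if abs(minor) > tol:
                        return False
    return True


psi1, psi2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
phi1, phi2 = [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]

product_state = outer(psi1, psi2)
two_term_state = add(outer(psi1, psi2), outer(phi1, phi2))

# The same two-term state written with rescaled factors.
c1, c2 = 2.0, 4.0
rescaled = add(outer([c1 * a for a in psi1], [b / c1 for b in psi2]),
               outer([c2 * a for a in phi1], [b / c2 for b in phi2]))

candidate_a = [a + b for a, b in zip(psi1, phi1)]            # psi1 + phi1
candidate_b = [c1 * a + c2 * b for a, b in zip(psi1, phi1)]  # c1 psi1 + c2 phi1
```

Here `rescaled` and `two_term_state` are the same table, yet `candidate_a`
and `candidate_b` are not proportional, which is the ambiguity described
above.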

Although we cannot associate with $\psi$ a wave function $\psi'$ for the first
particle, there is no difficulty in taking expectation values of observables
related to the first particle. We can make perfect sense of, say, the expected
position of the first particle, as

\[
\langle\psi, X_j^{(1)}\psi\rangle = \int_{\mathbb{R}^6} x_j\,|\psi(x, y)|^2\,dx\,dy.
\]

Here $X_j^{(1)}$ indicates the operator of multiplication by the $j$th
component of the first vector in the function
$\psi(\cdot, \cdot) : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{C}$. That is to say, the operator $X_j$
acting on $L^2(\mathbb{R}^3)$ can be "promoted" to an operator on $L^2(\mathbb{R}^6)$ by
having it act in the first variable only. Similarly, the momentum operator
$P_j$ on $L^2(\mathbb{R}^3)$ can be promoted to an operator $P_j^{(1)}$ on $L^2(\mathbb{R}^6)$,
by letting it act on the first variable, meaning that $P_j^{(1)}\psi$ is $-i\hbar$
times the partial derivative with respect to the $j$th component of the
first vector in $\psi(\cdot, \cdot)$. In fact, as we will see in Sect. 19.5, given
any self-adjoint operator on $L^2(\mathbb{R}^3)$, there is a natural way to promote
it into an operator on $L^2(\mathbb{R}^6)$, where its expectation value may then be
defined.

Thus, although there is no natural way to associate with a unit vector
$\psi$ in $L^2(\mathbb{R}^6)$ a unit vector in $L^2(\mathbb{R}^3)$, there is a natural way to
associate with $\psi$ expectation values of observables on $L^2(\mathbb{R}^3)$. This
suggests that we should introduce a more general notion of the "state" of
a quantum system, a notion in which with each "reasonable" family of
expectation values for the quantum observables there is associated a
quantum state. This notion turns out to be that of density matrices
(positive, self-adjoint operators with trace 1).

In Sect. 19.3, we introduce the notion of a density matrix. Theorem 19.9
in that section will tell us that, given any reasonable assignment $\Phi$ of
expectation values to observables, there is a unique density matrix $\rho$
such that $\Phi(A) = \mathrm{trace}(\rho A)$ for all observables $A$. In the special case
in which the state of the system is given by a unit vector $\psi$ in the
Hilbert space, $\rho$ will be just the projection onto $\psi$ and
$\mathrm{trace}(\rho A)$ will be equal to the familiar expression $\langle\psi, A\psi\rangle$. In
Sect. 19.5, we will consider composite quantum systems and introduce a
method (the partial trace) of defining a density matrix for a subsystem
from a density matrix for the whole system. Finally, in Sect. 19.6, we will
consider the important special case of composite systems made up of
multiple identical particles.


19.2 Trace-Class and Hilbert-Schmidt Operators 


In this section, we explore notions related to the trace of an operator on a 
Hilbert space. The results of this section are presented without proof; see 
Chap. VI in Volume I of [34] for proofs and additional information. 


Proposition 19.1 Suppose $A \in \mathcal{B}(\mathbf{H})$ is non-negative and self-adjoint.
Then for any two orthonormal bases $\{e_j\}$ and $\{f_j\}$ for $\mathbf{H}$, we have

\[
\sum_j \langle e_j, Ae_j\rangle = \sum_j \langle f_j, Af_j\rangle.
\]

Note that since $A$ is non-negative, $\langle e_j, Ae_j\rangle$ and $\langle f_j, Af_j\rangle$ are
non-negative real numbers. Thus, the sums are always well defined, but may
have the value $+\infty$.


Definition 19.2 If $A \in \mathcal{B}(\mathbf{H})$ is non-negative and self-adjoint, the
value of $\sum_j \langle e_j, Ae_j\rangle$, for any arbitrarily chosen orthonormal basis,
is called the trace of $A$. If $\mathrm{trace}(A) < +\infty$, then we say that $A$ is
trace class.

For a general $A \in \mathcal{B}(\mathbf{H})$, we say that $A$ is trace class if the
non-negative self-adjoint operator $\sqrt{A^*A}$ is trace class.


Note that for any $A \in \mathcal{B}(\mathbf{H})$, $A^*A$ is self-adjoint and non-negative.
Thus, the square root of $A^*A$ may be defined by the functional calculus
(Definition 7.13 or Proposition 8.4).


Proposition 19.3

1. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then for any orthonormal basis
$\{e_j\}$, the sum $\sum_j \langle e_j, Ae_j\rangle$ is absolutely convergent. Furthermore,
the value of this sum, which we denote as $\mathrm{trace}(A)$, is independent of
the choice of orthonormal basis.

2. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then $A^*$ is also trace class and

\[
\mathrm{trace}(A^*) = \overline{\mathrm{trace}(A)}.
\]

3. If $A \in \mathcal{B}(\mathbf{H})$ is trace class, then for all $B \in \mathcal{B}(\mathbf{H})$, the
operators $AB$ and $BA$ are also trace class, and

\[
\mathrm{trace}(AB) = \mathrm{trace}(BA).
\]


Recall that $A \in \mathcal{B}(\mathbf{H})$ is said to be compact if $A$ maps every bounded
set in $\mathbf{H}$ to a set with compact closure. If a self-adjoint operator $A$ is
trace class, it is necessarily compact and thus has an orthonormal basis
$\{e_j\}$ of eigenvectors, for which the associated eigenvalues $\lambda_j$ are real
and tend to zero as $j$ tends to infinity. (See Theorem VI.16 in Volume I of
[34]. One can deduce the result from, say, the direct integral form of the
spectral theorem for bounded self-adjoint operators by verifying that unless
$A$ has point spectrum with eigenvalues tending to zero, the operator of
multiplication by $\lambda$ in the direct integral will not be compact.) Point 1
of Proposition 19.3 then tells us that $\sum_j |\lambda_j| < \infty$ and that
$\mathrm{trace}(A) = \sum_j \lambda_j$. Conversely, if $A$ is a self-adjoint operator having
an orthonormal basis of eigenvectors for which the associated eigenvalues
satisfy $\sum_j |\lambda_j| < \infty$, then $A$ is trace class.


Definition 19.4 An operator $A \in \mathcal{B}(\mathbf{H})$ is said to be Hilbert-Schmidt
if $\mathrm{trace}(A^*A) < \infty$.


Since $A^*A$ is self-adjoint and non-negative, $\mathrm{trace}(A^*A)$ is defined (but
possibly infinite) for any $A \in \mathcal{B}(\mathbf{H})$. If $A$ is trace class, then (by
definition) the trace of $\sqrt{A^*A}$ is finite, in which case, the trace of
$\sqrt{A^*A}\sqrt{A^*A}$ is also finite, by Point 3 of Proposition 19.3. Thus,
every trace-class operator is Hilbert-Schmidt (but not vice versa).


Proposition 19.5 If $A \in \mathcal{B}(\mathbf{H})$ is Hilbert-Schmidt, so is $A^*$. If
$A, B \in \mathcal{B}(\mathbf{H})$ are Hilbert-Schmidt, then $AB$ and $BA$ are trace class
and $\mathrm{trace}(AB)$ equals $\mathrm{trace}(BA)$.


If $A$ and $B$ are Hilbert-Schmidt operators, the Hilbert-Schmidt inner
product of $A$ and $B$ is $\langle A, B\rangle_{HS} := \mathrm{trace}(A^*B)$ and the
Hilbert-Schmidt norm of $A$ satisfies $\|A\|_{HS}^2 = \langle A, A\rangle_{HS}$. The space
of Hilbert-Schmidt operators is a Hilbert space with respect to
$\langle\cdot, \cdot\rangle_{HS}$.
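
In finite dimensions every operator is trace class, and the statements
above reduce to familiar matrix facts that are easy to check numerically.
The following sketch (plain Python, with illustrative names and an
arbitrarily chosen pair of $2 \times 2$ matrices) checks basis independence of
the trace, the identity $\mathrm{trace}(AB) = \mathrm{trace}(BA)$, and the formula
$\|A\|_{HS}^2 = \sum_{j,k}|A_{jk}|^2$. It assumes the convention that the
inner product is linear in its second argument; it says nothing about the
convergence questions that arise in infinite dimensions.

```python
import math


def inner(u, v):
    """Inner product, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(A, v):
    """Matrix-vector product."""
    return [sum(A[r][c] * v[c] for c in range(len(v))) for r in range(len(A))]


def matmul(A, B):
    """Matrix product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]


def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[c][r].conjugate() for c in range(n)] for r in range(n)]


def trace(A):
    return sum(A[j][j] for j in range(len(A)))


def trace_in_basis(A, basis):
    """Sum of <e_j, A e_j> over the given orthonormal basis."""
    return sum(inner(e, apply(A, e)) for e in basis)


A = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]
B = [[0 + 0j, 0 + 1j], [2 + 0j, 1 + 0j]]

standard = [[1 + 0j, 0j], [0j, 1 + 0j]]
t = 0.7
rotated = [[math.cos(t) + 0j, math.sin(t) + 0j],
           [-math.sin(t) + 0j, math.cos(t) + 0j]]

hs_norm_squared = trace(matmul(dagger(A), A)).real
```

For these matrices, `trace_in_basis(A, standard)` and
`trace_in_basis(A, rotated)` agree up to rounding, and `hs_norm_squared`
equals the sum of the squared moduli of the entries of `A`.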


19.3 Density Matrices: The General Notion 
of the State of a Quantum System 


Typically, we think of the quantum observables (the ones with expectation
values that we wish to take) as being unbounded self-adjoint operators.
But of course we can also take expectation values of bounded self-adjoint
operators, and indeed expectations for bounded operators determine those
for unbounded operators. After all, suppose $A$ is an unbounded
self-adjoint operator and suppose we know the expectation value for
$1_E(A)$ for every Borel set $E \subset \mathbb{R}$, where $1_E$ is the indicator function
of $E$ and $1_E(A)$ is defined by the functional calculus (Definition 7.13).
The expectation value for $1_E(A)$ is the probability of obtaining a value
in $E$ for a measurement of the observable $A$. If we know this probability
for each $E$, then we know the full probability distribution of the
measurements, and thus we can compute the expectation value of $A$.
Furthermore, we can always introduce expectation values for (bounded)
non-self-adjoint operators. Each such operator $A$ is of the form
$A = A_1 + iA_2$ with $A_1$ and $A_2$ self-adjoint, and so we may reasonably
define the expectation value of $A$ to be the expectation value of $A_1$
plus $i$ times the expectation value of $A_2$.

We then postulate that the general notion of the "state" of a quantum
system should be simply a "list" of expectation values for all bounded
operators, satisfying some reasonable hypotheses.


Definition 19.6 A linear map $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ is a family of
expectation values if the following conditions hold.

1. $\Phi(I) = 1$.

2. $\Phi(A)$ is real whenever $A$ is self-adjoint.

3. $\Phi(A) \geq 0$ whenever $A$ is self-adjoint and non-negative.

4. For any sequence $A_n$ in $\mathcal{B}(\mathbf{H})$, if $\|A_n\psi - A\psi\| \to 0$ for all
$\psi \in \mathbf{H}$, then $\Phi(A_n) \to \Phi(A)$.


Point 4 in the definition says that $\Phi$ is continuous with respect to
strong (sequential) convergence in $\mathcal{B}(\mathbf{H})$. By Exercise 3, any linear map
on $\mathcal{B}(\mathbf{H})$ satisfying Points 1, 2, and 3 is automatically continuous with
respect to the operator norm topology, meaning that if $\|A_n - A\| \to 0$
then $\Phi(A_n) \to \Phi(A)$. However, to establish our characterization of
families of expectation values in terms of density matrices, we need
continuity of $\Phi$ under a more general sort of convergence, where we only
assume that $\|A_n\psi - A\psi\| \to 0$ for each $\psi$. This stronger continuity
property does not follow from Points 1-3. Exercise 5 gives an example of a
linear functional on $\mathcal{B}(\mathbf{H})$ that satisfies Points 1-3 of Definition 19.6,
but not Point 4.


Definition 19.7 An operator $\rho \in \mathcal{B}(\mathbf{H})$ is a density matrix if $\rho$
is self-adjoint and non-negative and $\mathrm{trace}(\rho) = 1$.


Of course, since the trace of a density matrix is assumed to be finite,
every density matrix is trace class. The next two results give a precise
characterization of families of expectation values in terms of density
matrices.


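Proposition 19.8 Suppose $\rho$ is a density matrix on $\mathbf{H}$. Then the map
$\Phi_\rho : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ given by

\[
\Phi_\rho(A) = \mathrm{trace}(\rho A) = \mathrm{trace}(A\rho)
\]

is a family of expectation values.

Proof. If we define $\Phi_\rho(A) = \mathrm{trace}(\rho A)$, then
$\Phi_\rho(I) = \mathrm{trace}(\rho) = 1$. For any $A \in \mathcal{B}(\mathbf{H})$, we have

\[
\mathrm{trace}(\rho A^*) = \mathrm{trace}(A^*\rho) = \mathrm{trace}((\rho A)^*) = \overline{\mathrm{trace}(\rho A)}.
\]

It follows that $\mathrm{trace}(\rho A)$ is real when $A$ is self-adjoint. Let
$\rho^{1/2}$ be the non-negative self-adjoint square root of $\rho$. Then
$\rho^{1/2}$ and $A\rho^{1/2}$ are Hilbert-Schmidt (in the latter case, by Point 3
of Proposition 19.3). It follows that
$\mathrm{trace}(A\rho^{1/2}\rho^{1/2}) = \mathrm{trace}(\rho^{1/2}A\rho^{1/2})$, by Proposition 19.5.
Thus, if $A$ is self-adjoint and non-negative,

\[
\mathrm{trace}(\rho A) = \mathrm{trace}(\rho^{1/2}\rho^{1/2}A) = \mathrm{trace}(\rho^{1/2}A\rho^{1/2}) \geq 0, \qquad (19.1)
\]

because $\rho^{1/2}A\rho^{1/2}$ is self-adjoint and non-negative. We have
established that $\Phi_\rho$ satisfies Points 1, 2, and 3 of Definition 19.6.

Meanwhile, suppose $A_n\psi$ converges in norm to $A\psi$, for each $\psi$ in
$\mathbf{H}$. Then $\|A_n\psi\|$ is bounded as a function of $n$ for each fixed $\psi$.
Thus, by the principle of uniform boundedness (Theorem A.40), there is a
constant $C$ such that $\|A_n\| \leq C$. Now, if $\{e_j\}$ is an orthonormal
basis for $\mathbf{H}$, we have

\[
\left|\langle e_j, \rho^{1/2}A_n\rho^{1/2}e_j\rangle\right|
= \left|\langle \rho^{1/2}e_j, A_n\rho^{1/2}e_j\rangle\right|
\leq C\,\|\rho^{1/2}e_j\|^2
\]

and

\[
\sum_j \|\rho^{1/2}e_j\|^2 = \sum_j \langle\rho^{1/2}e_j, \rho^{1/2}e_j\rangle
= \sum_j \langle e_j, \rho e_j\rangle = \mathrm{trace}(\rho) < \infty.
\]

Furthermore, since $A_n(\rho^{1/2}e_j)$ converges to $A(\rho^{1/2}e_j)$ for each
$j$, dominated convergence tells us that

\[
\begin{aligned}
\mathrm{trace}(\rho^{1/2}A\rho^{1/2}) &= \sum_j \langle e_j, \rho^{1/2}A\rho^{1/2}e_j\rangle \\
&= \lim_{n\to\infty}\sum_j \langle e_j, \rho^{1/2}A_n\rho^{1/2}e_j\rangle \\
&= \lim_{n\to\infty}\mathrm{trace}(\rho^{1/2}A_n\rho^{1/2}).
\end{aligned}
\]

As in (19.1), we can shift the second factor of $\rho^{1/2}$ to the front of
the trace to obtain Point 4 in Definition 19.6. ∎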


Theorem 19.9 For any family of expectation values $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$,
there is a unique density matrix $\rho$ such that $\Phi(A) = \mathrm{trace}(\rho A)$ for
all $A \in \mathcal{B}(\mathbf{H})$.


Proof. Recall from Sect. 3.12 the Dirac notation, in which the expression
$|\phi\rangle\langle\psi|$ denotes the linear operator taking any vector $\chi \in \mathbf{H}$ to
the vector $|\phi\rangle\langle\psi|\chi\rangle$ (in physics notation), that is, the vector
$\langle\psi, \chi\rangle\phi$ (in math notation). If $\rho$ is trace class, then by
Exercise 2,

\[
\mathrm{trace}(\rho\,|\phi\rangle\langle\psi|) = \langle\psi, \rho\phi\rangle.
\]

Thus, if an operator $\rho$ with the desired properties is to exist, we must
have

\[
\langle\psi, \rho\phi\rangle = \Phi(|\phi\rangle\langle\psi|).
\]

Now, by Exercise 3, $\Phi$ satisfies $|\Phi(A)| \leq \|A\|$. From this, we can see
that the map

\[
L_\Phi(\phi, \psi) = \Phi(|\phi\rangle\langle\psi|)
\]

is a bounded sesquilinear form, so that (by Proposition A.63), there is
a unique bounded operator $\rho$ such that
$\Phi(|\phi\rangle\langle\psi|) = \langle\psi, \rho\phi\rangle$ for all $\phi$ and $\psi$. Since
$|\phi\rangle\langle\phi|$ is self-adjoint and non-negative, $L_\Phi(\phi, \phi)$ is real and
non-negative, which means that $\rho$ is self-adjoint (by Proposition A.63)
and non-negative.

Meanwhile, if $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, then by
Definition 19.2,

\[
\begin{aligned}
\mathrm{trace}(\rho) &= \lim_{N\to\infty}\sum_{j=1}^N \langle e_j, \rho e_j\rangle \\
&= \lim_{N\to\infty}\Phi\bigl(|e_1\rangle\langle e_1| + \cdots + |e_N\rangle\langle e_N|\bigr) \\
&= \Phi(I) = 1.
\end{aligned}
\]

In passing from the second line to the third, we have used Point 4 of
Definition 19.6. Thus, $\rho$ is a density matrix.

We have now found a density matrix $\rho$ such that $\Phi(|\phi\rangle\langle\psi|)$ agrees
with $\mathrm{trace}(\rho\,|\phi\rangle\langle\psi|)$ for all $\phi, \psi \in \mathbf{H}$. By linearity,
$\Phi(A) = \mathrm{trace}(\rho A)$ for all finite-rank operators $A$ (see Exercise 4).
Now, if $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, let $P_N$ be the
orthogonal projection onto the span of $e_1, \ldots, e_N$. Then for any
$A \in \mathcal{B}(\mathbf{H})$, the operator $P_NA$ has finite rank and $P_NA\psi \to A\psi$
for all $\psi \in \mathbf{H}$. Thus, for all $A \in \mathcal{B}(\mathbf{H})$,

\[
\Phi(A) = \lim_{N\to\infty}\Phi(P_NA) = \lim_{N\to\infty}\mathrm{trace}(\rho P_NA) = \mathrm{trace}(\rho A),
\]

by Proposition 19.8. ∎


Our next result shows that our new notion of the state of a system 
includes our old notion. 


Proposition 19.10 For any unit vector $\psi \in \mathbf{H}$, let $|\psi\rangle\langle\psi|$, in
accordance with Notation 3.29, denote the orthogonal projection onto the
span of $\psi$. Then $|\psi\rangle\langle\psi|$ is a density matrix and for all
$A \in \mathcal{B}(\mathbf{H})$, we have

\[
\mathrm{trace}(|\psi\rangle\langle\psi|\,A) = \langle\psi, A\psi\rangle.
\]

Note that if $\psi_2 = e^{i\theta}\psi_1$, then $|\psi_1\rangle\langle\psi_1| = |\psi_2\rangle\langle\psi_2|$. Thus,
from our new point of view, we may say that the reason $\psi_1$ and $\psi_2$
represent the same "physical state" is that they determine the same
density matrix.

Proof. Since it is an orthogonal projection, $|\psi\rangle\langle\psi|$ is bounded,
self-adjoint, and non-negative. To compute its trace, we choose an
orthonormal basis $\{e_j\}$ for $\mathbf{H}$ with $e_1 = \psi$, which gives
$\mathrm{trace}(|\psi\rangle\langle\psi|) = 1$. Using the same orthonormal basis, we compute
that, for any $A \in \mathcal{B}(\mathbf{H})$,

\[
\mathrm{trace}(|\psi\rangle\langle\psi|\,A) = \sum_j \langle e_j, \psi\rangle\langle\psi, Ae_j\rangle = \langle\psi, A\psi\rangle,
\]

as desired. ∎
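
For $\mathbf{H} = \mathbb{C}^2$, Proposition 19.10 and the remark about phases can be
checked directly. The sketch below uses illustrative names and an
arbitrarily chosen unit vector and self-adjoint matrix; the matrix of
$|\psi\rangle\langle\psi|$ is taken to have entries $\psi_r\overline{\psi_c}$,
consistent with the convention that $|\phi\rangle\langle\psi|$ sends $\chi$ to
$\langle\psi, \chi\rangle\phi$.

```python
import cmath


def projector(psi):
    """Matrix of |psi><psi|: entry (r, c) is psi_r * conj(psi_c)."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]


def trace(A):
    return sum(A[j][j] for j in range(len(A)))


def inner(u, v):
    """Inner product, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(A, v):
    return [sum(A[r][c] * v[c] for c in range(len(v))) for r in range(len(A))]


psi = [0.6 + 0j, 0.8j]                        # a unit vector in C^2
A = [[1 + 0j, 2 - 1j], [2 + 1j, -1 + 0j]]     # a self-adjoint matrix

rho = projector(psi)
lhs = trace(matmul(rho, A))                   # trace(|psi><psi| A)
rhs = inner(psi, apply(A, psi))               # <psi, A psi>

# Multiplying psi by a phase does not change the density matrix.
phase = cmath.exp(0.9j)
rho_rotated = projector([phase * a for a in psi])
max_diff = max(abs(rho[r][c] - rho_rotated[r][c])
               for r in range(2) for c in range(2))
```

Here `lhs` and `rhs` agree, and `max_diff` is zero up to rounding, so the
vectors $\psi$ and $e^{i\theta}\psi$ give the same density matrix.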


Definition 19.11 A density matrix $\rho \in \mathcal{B}(\mathbf{H})$ is a pure state if
there exists a unit vector $\psi \in \mathbf{H}$ such that $\rho$ is equal to the
orthogonal projection onto the span of $\psi$. The density matrix $\rho$ is
called a mixed state if no such unit vector $\psi$ exists.


An isolated system that is in a pure state initially will remain in a pure
state for all later times, since the initial state $\psi_0$ evolves to the pure
state $e^{-i\hat{H}t/\hbar}\psi_0$, where $\hat{H}$ is the Hamiltonian for the system.
But if a system is interacting with its environment, then as discussed in
Sect. 19.5, the system may move into a mixed state at a later time.

There are several different ways of characterizing the pure states as a
subset of the density matrices. First, it is not hard to see (Exercise 6)
that a density matrix $\rho$ is a pure state if and only if
$\mathrm{trace}(\rho^2) = 1$. Second, the set of density matrices is a convex set,
since if $\rho_1$ and $\rho_2$ are non-negative and have trace 1, then so is
$\lambda\rho_1 + (1 - \lambda)\rho_2$, for $0 \leq \lambda \leq 1$. According to Exercise 7, the
pure states are precisely the extreme points of this set. That is, a
density matrix $\rho$ is a pure state if and only if it cannot be expressed
as $\rho = \lambda\rho_1 + (1 - \lambda)\rho_2$ where $\rho_1$ and $\rho_2$ are distinct density
matrices and $\lambda$ belongs to $(0, 1)$. Third, we may define the von Neumann
entropy $S(\rho)$ of a density matrix $\rho$ by

\[
S(\rho) = \mathrm{trace}(-\rho\log\rho),
\]

where $\rho\log\rho$ is defined by the functional calculus. (Since
$\lim_{\lambda\to 0^+}\lambda\log\lambda = 0$, we interpret $0\log 0$ as being 0.) Since the
eigenvalues of $\rho$ are all between 0 and 1, we see that $-\rho\log\rho$ is a
non-negative self-adjoint operator, which has a well-defined trace, which
may have the value $+\infty$. According to Exercise 8, a density matrix $\rho$
is a pure state if and only if $S(\rho) = 0$.
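
For a $2 \times 2$ density matrix, the first and third criteria can be
evaluated in closed form, since the eigenvalues are
$\lambda_\pm = (1 \pm \sqrt{1 - 4\det\rho})/2$. The following sketch (with
illustrative names, restricted to the $2 \times 2$ case by assumption)
computes $\mathrm{trace}(\rho^2)$ and $S(\rho)$ from these eigenvalues, using the
convention $0\log 0 = 0$.

```python
import math


def eigenvalues_2x2_density(rho):
    """Eigenvalues of a 2x2 self-adjoint matrix with trace 1."""
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    root = math.sqrt(max(0.0, 1.0 - 4.0 * det))   # guard against rounding
    return ((1.0 + root) / 2.0, (1.0 - root) / 2.0)


def purity(rho):
    """trace(rho^2), computed as the sum of squared eigenvalues."""
    return sum(lam ** 2 for lam in eigenvalues_2x2_density(rho))


def von_neumann_entropy(rho):
    """S(rho) = -sum lam log lam, with 0 log 0 interpreted as 0."""
    return -sum(lam * math.log(lam)
                for lam in eigenvalues_2x2_density(rho) if lam > 0.0)


pure = [[0.5, 0.5], [0.5, 0.5]]      # projection onto (1, 1)/sqrt(2)
mixed = [[0.5, 0.0], [0.0, 0.5]]     # equal mixture of two orthogonal states
```

The matrix `pure` has purity 1 and entropy 0, whereas `mixed` has purity
$1/2$ and entropy $\log 2$, the largest value possible in two dimensions.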

Suppose that we have two pure states, coming from unit vectors $\psi_1$ and
$\psi_2$. Then there are two different senses in which we can take a
superposition, that is, linear combination, of the corresponding quantum
states. If we use our old point of view, in which the states are vectors
in $\mathbf{H}$, then we may take the linear combination $c_1\psi_1 + c_2\psi_2$, and
then normalize this vector to be a unit vector. If we use our new point of
view, in which the states are density matrices, then we may take the
linear combination $c_1|\psi_1\rangle\langle\psi_1| + c_2|\psi_2\rangle\langle\psi_2|$, where in this case
$c_1$ and $c_2$ should be non-negative and should add to 1. These two
notions of superposition are different, since

\[
C\,|c_1\psi_1 + c_2\psi_2\rangle\langle c_1\psi_1 + c_2\psi_2| \neq c_1|\psi_1\rangle\langle\psi_1| + c_2|\psi_2\rangle\langle\psi_2|, \qquad (19.2)
\]

no matter how the constant $C$ is chosen. After all, the state on the
left-hand side of (19.2) is a pure state, whereas (unless $\psi_2$ is a
multiple of $\psi_1$), the state on the right-hand side of (19.2) is a mixed
state, since the range of this operator is 2-dimensional rather than
1-dimensional.

Physicists call the first sort of superposition (in which we take a linear
combination of vectors in $\mathbf{H}$) coherent superposition or quantum
superposition, and they call the second sort of superposition (in which we
take a linear combination of the associated density matrices) incoherent
superposition. The reason for the term "coherent" is that coherent
superposition depends on the phases of the coefficients. That is, if
$\psi_1$ and $\psi_2$ are linearly independent, the vector
$c_1e^{i\theta}\psi_1 + c_2e^{i\phi}\psi_2$ does not represent the same quantum state
as $c_1\psi_1 + c_2\psi_2$, unless $e^{i\theta} = e^{i\phi}$. By contrast, the density
matrix associated with $e^{i\theta}\psi$ is the same as the density matrix
associated with $\psi$, and so the phases have no effect when taking linear
combinations of the density matrices associated to vectors in $\mathbf{H}$. When
taking a coherent superposition, there is no simple relationship between
the expectation value of an observable in the states $\psi_1$ and $\psi_2$ and
the expectation value of the same observable in the state
$c_1\psi_1 + c_2\psi_2$. On the other hand, when taking an incoherent
superposition, expectation values in the new state are just linear
combinations of the original expectation values:

\[
\mathrm{trace}\bigl((c_1|\psi_1\rangle\langle\psi_1| + c_2|\psi_2\rangle\langle\psi_2|)A\bigr) = c_1\langle\psi_1, A\psi_1\rangle + c_2\langle\psi_2, A\psi_2\rangle.
\]
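
The difference between the two kinds of superposition already shows up in
$\mathbb{C}^2$. In the sketch below (illustrative names; the vectors and the
observable are assumptions chosen for simplicity), $\psi_1$ and $\psi_2$ are
the standard basis vectors and $A$ is the matrix exchanging them. The two
coherent superpositions $(\psi_1 \pm \psi_2)/\sqrt{2}$ give different
expectation values, whereas the incoherent superposition with weights
$1/2$ gives the average of the expectation values in $\psi_1$ and $\psi_2$.

```python
import math


def projector(psi):
    """Matrix of |psi><psi| for a real or complex vector psi."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]


def expectation(rho, A):
    """trace(rho A) for a density matrix rho and a self-adjoint matrix A."""
    product = matmul(rho, A)
    return sum(product[j][j] for j in range(len(product))).real


psi1, psi2 = [1.0, 0.0], [0.0, 1.0]
A = [[0.0, 1.0], [1.0, 0.0]]          # exchanges psi1 and psi2

s = 1.0 / math.sqrt(2.0)
plus = [s * (a + b) for a, b in zip(psi1, psi2)]    # coherent, relative phase +1
minus = [s * (a - b) for a, b in zip(psi1, psi2)]   # coherent, relative phase -1

p1, p2 = projector(psi1), projector(psi2)
incoherent = [[0.5 * p1[r][c] + 0.5 * p2[r][c] for c in range(2)]
              for r in range(2)]

coherent_plus_value = expectation(projector(plus), A)
coherent_minus_value = expectation(projector(minus), A)
incoherent_value = expectation(incoherent, A)
```

The coherent values are $+1$ and $-1$, showing the dependence on the
relative phase, while the incoherent value is 0, the average of
$\langle\psi_1, A\psi_1\rangle = 0$ and $\langle\psi_2, A\psi_2\rangle = 0$.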


19.4 Modified Axioms for Quantum Mechanics 


We may now modify the axioms of quantum mechanics introduced in 
Sect. 3.6 to incorporate density matrices, beginning with our revised no- 
tion of a state. 


Axiom 6 The state of a quantum system is described by a density matrix
$\rho$ on an appropriate Hilbert space $\mathbf{H}$. If $A$ is any bounded operator
on $\mathbf{H}$, the expectation value of $A$ in the state $\rho$ is given by the
quantity $\mathrm{trace}(\rho A) = \mathrm{trace}(A\rho)$.


In Axiom 6, we assume that $A$ is bounded, so that $\mathrm{trace}(\rho A)$ and
$\mathrm{trace}(A\rho)$ are defined and equal by Proposition 19.3. If $A$ is
unbounded and self-adjoint, we can construct a probability measure
$\mu_A^\rho$ describing the probabilities for measurements of $A$ in the state
$\rho$, by the formula

\[
\mu_A^\rho(E) = \mathrm{trace}(\rho\,1_E(A)),
\]

where $1_E(A)$ is defined by the functional calculus.

We then define the expectation value of $A$ in the state $\rho$ as
$\int_{\mathbb{R}} \lambda\,d\mu_A^\rho(\lambda)$, provided the integral is absolutely convergent. If
the integral is absolutely convergent, it is reasonable to hope that both
$\rho A$ and $A\rho$ will be densely defined and bounded, that (the bounded
extension to $\mathbf{H}$ of) these operators will be trace class, and that both
$\mathrm{trace}(\rho A)$ and $\mathrm{trace}(A\rho)$ will coincide with
$\int_{\mathbb{R}} \lambda\,d\mu_A^\rho(\lambda)$. We will not, however, enter into an investigation of
this issue.

Next, we propose a variant of Axiom 4, describing the “collapse of the 
wave function.” 


Axiom 7 Suppose a quantum system is initially in a state p and a mea- 
surement of a self-adjoint operator A with point spectrum is performed. If 
the measurement results in the value for A, then immediately after the 
measurement, the system will be in the state p’, where 


,_ 1 P. 
p= ZPxpPy. 


Here Py is the orthogonal projection onto the -eigenspace of A and Z = 
trace(P)pP,). 


Note that if $\rho$ is non-negative, self-adjoint, and trace class, then $P_\lambda \rho P_\lambda$
is also non-negative, self-adjoint, and trace class. Implicit in Axiom 7 is
the assumption that the measurement can only result in values $\lambda$ for which
$P_\lambda \rho P_\lambda$ is nonzero. In particular, $\lambda$ must be an eigenvalue for $A$.
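
The collapse rule of Axiom 7 can be sketched in finite dimensions as follows. The three-level system, the degenerate eigenvalue, and the function name `collapse` are assumptions chosen for illustration; the code simply forms PρP and divides by its trace.

```python
# Minimal sketch of Axiom 7 for a 3x3 density matrix (illustrative values).

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def collapse(rho, P):
    """Return (rho', Z) with rho' = P rho P / Z and Z = trace(P rho P)."""
    PrP = matmul(matmul(P, rho), P)
    Z = trace(PrP)
    if Z == 0:
        # Axiom 7 only allows outcomes for which P rho P is nonzero.
        raise ValueError("this outcome cannot occur in the given state")
    return [[entry / Z for entry in row] for row in PrP], Z

# Assumed observable A = diag(1, 1, 2); the eigenvalue 1 is degenerate.
rho = [[0.5, 0.1, 0.0],
       [0.1, 0.3, 0.0],
       [0.0, 0.0, 0.2]]
P_1 = [[1.0, 0.0, 0.0],   # projection onto the 1-eigenspace of A
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0]]

rho_after, Z = collapse(rho, P_1)
```

Here Z is the probability of obtaining the value 1, and the off-diagonal entries inside the eigenspace survive the measurement, rescaled by 1/Z.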

Finally, we introduce the notion of time-evolution for our new notion of 
“state.” 


Axiom 8 The time evolution of the state of the system is described by the
following equation for a time-dependent density matrix $\rho(t)$:

\[
\frac{d\rho}{dt} = -\frac{1}{i\hbar}[\rho, \hat{H}]. \tag{19.3}
\]

This equation may be solved, formally, by setting

\[
\rho(t) = e^{-it\hat{H}/\hbar} \rho_0 e^{it\hat{H}/\hbar}, \tag{19.4}
\]

where $\rho_0$ is the state of the system at time $t = 0$.


There are some domain issues involved in the interpretation of the equation
(19.3). Rather than entering into an examination of those issues here,
we will simply take (19.4) as the definition of the time-evolution of a density
matrix. Presumably, if $\rho_0$ is nice enough, then the map $t \mapsto \rho(t)$ will be
differentiable as a curve in the Banach space $\mathcal{B}(\mathbf{H})$ and its derivative will
be (an extension of) the operator on the right-hand side of (19.3). By
comparison, it follows from Stone’s theorem and Lemma 10.17 that the family
of pure states $\psi(t) := e^{-it\hat{H}/\hbar}\psi_0$ satisfies the Schrödinger equation in the
natural Hilbert space sense if and only if $\psi_0$ belongs to the domain of $\hat{H}$.
To see that the time-evolution in (19.4) is consistent with the previously
defined time-evolution of pure states, observe that

\[
e^{-it\hat{H}/\hbar} |\psi_0\rangle\langle\psi_0| e^{it\hat{H}/\hbar} = |e^{-it\hat{H}/\hbar}\psi_0\rangle\langle e^{-it\hat{H}/\hbar}\psi_0| = |\psi(t)\rangle\langle\psi(t)|,
\]

since $(e^{it\hat{H}/\hbar})^* = e^{-it\hat{H}/\hbar}$.

It should be noted that (19.3) differs by a minus sign from the
time-evolution in the Heisenberg picture of quantum mechanics (Definition 3.20).
Although this difference may seem strange, keep in mind that in Axiom 8,
we are not adopting the Heisenberg point of view, in which the states
are independent of time and the observables evolve in time. Rather, we
are adopting a modified version of the Schrödinger picture, in which it
is the states that evolve in time, but where the states are now certain
sorts of operators. Even though both the states and the observables are
now operators, the observables (in the Heisenberg picture) and the states
(in the Schrödinger picture) must evolve in opposite directions in time, in
order for the expectation values of the observables to be the same in the
two pictures.
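
The statement that states and observables must evolve in opposite directions can be checked numerically in a toy case. The sketch below assumes units with ħ = 1 and a Hamiltonian that is diagonal in the chosen basis, so that conjugation by e^{-itH} acts entrywise by a phase; the energies, matrices, and function names are illustrative assumptions.

```python
import cmath

HBAR = 1.0                 # assumption: units with hbar = 1
energies = [0.0, 1.5]      # assumption: H = diag(0, 1.5) in this basis

def evolve_state(rho0, t):
    """Schrodinger picture, as in (19.4): rho(t) = U rho0 U*, U = exp(-itH/hbar)."""
    n = len(energies)
    return [[cmath.exp(-1j * (energies[j] - energies[k]) * t / HBAR) * rho0[j][k]
             for k in range(n)] for j in range(n)]

def evolve_observable(A, t):
    """Heisenberg picture: A(t) = U* A U, the opposite conjugation."""
    n = len(energies)
    return [[cmath.exp(1j * (energies[j] - energies[k]) * t / HBAR) * A[j][k]
             for k in range(n)] for j in range(n)]

def trace_product(X, Y):
    """trace(XY) without forming the product matrix."""
    n = len(X)
    return sum(X[j][k] * Y[k][j] for j in range(n) for k in range(n))

rho0 = [[0.5, 0.5],        # pure state |psi><psi| with psi = (e1 + e2)/sqrt(2)
        [0.5, 0.5]]
A = [[0.0, 1.0],
     [1.0, 0.0]]
t = 0.7

schrodinger_value = trace_product(evolve_state(rho0, t), A)
heisenberg_value = trace_product(rho0, evolve_observable(A, t))
```

For this particular choice both quantities equal cos(1.5 t), so the expectation value is the same in the two pictures even though the conjugations run in opposite directions.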


19.5 Composite Systems and the Tensor Product 


As discussed in Sect. 3.11, the Hilbert space for two (nonidentical, spinless)
particles moving in $\mathbb{R}^3$ is $L^2(\mathbb{R}^6)$. Given a unit vector (i.e., a pure state)
$\psi$ in $L^2(\mathbb{R}^6)$, the quantity $|\psi(\mathbf{x}^1, \mathbf{x}^2)|^2$ represents the joint probability
distribution for the position $\mathbf{x}^1$ of the first particle and the position $\mathbf{x}^2$ of
the second particle. The following result shows that $L^2(\mathbb{R}^6)$ is naturally
isomorphic to the Hilbert tensor product of two copies of the Hilbert space
for the individual particles, namely $L^2(\mathbb{R}^3)$.


Proposition 19.12 Suppose that $(X_1, \mu_1)$ and $(X_2, \mu_2)$ are $\sigma$-finite
measure spaces. Then there is a unique unitary map

\[
p : L^2(X_1, \mu_1) \hat{\otimes} L^2(X_2, \mu_2) \to L^2(X_1 \times X_2, \mu_1 \times \mu_2)
\]

such that

\[
p(\phi \otimes \psi)(x, y) = \phi(x)\psi(y)
\]

for all $\phi \in L^2(X_1, \mu_1)$ and $\psi \in L^2(X_2, \mu_2)$.


Here $\hat{\otimes}$ denotes the Hilbert tensor product defined in Appendix A.4.5.
Proof. For simplicity of notation, we suppress the dependence of $L^2$ spaces
on the measure, writing, say, $L^2(X_1)$ rather than $L^2(X_1, \mu_1)$. Consider first
the algebraic (i.e., uncompleted) tensor product $L^2(X_1) \otimes L^2(X_2)$. Using the
universal property of tensor products, we can construct a linear map $p$ of
$L^2(X_1) \otimes L^2(X_2)$ into $L^2(X_1 \times X_2)$ determined uniquely by the requirement
that

\[
p(\phi \otimes \psi)(x, y) = \phi(x)\psi(y).
\]

Now, every element of the algebraic tensor product $L^2(X_1) \otimes L^2(X_2)$ can
be expressed as a linear combination of elements of the form $\phi_j \otimes \psi_j$, with
$\phi_j \in L^2(X_1)$ and $\psi_j$ in $L^2(X_2)$. By computing on such linear combinations,
we can easily verify that $p$ is isometric. Thus, by the bounded linear
transformation (BLT) theorem (Theorem A.36), $p$ has a unique isometric
extension to a map of the completed tensor product $L^2(X_1) \hat{\otimes} L^2(X_2)$ into
$L^2(X_1 \times X_2)$.

It remains only to show that $p$ is surjective. Since both measures are
$\sigma$-finite, it is a simple exercise to reduce the problem to the case where $\mu_1$
and $\mu_2$ are finite, which we henceforth assume. Suppose $\psi \in L^2(X_1 \times X_2)$
is orthogonal to the image of $p$. Then $\psi$ is orthogonal to the indicator
function of every measurable rectangle, and hence to the indicator function
of any finite disjoint union of measurable rectangles. The collection $\mathcal{A}$ of
such disjoint unions is an algebra of sets. Let $\mathcal{M}$ denote the collection of
measurable subsets $F$ of $X_1 \times X_2$ such that the integral of $\psi$ over $F$ is zero.
Then $\mathcal{M}$ is a monotone class containing $\mathcal{A}$. By the monotone class lemma
(Theorem A.8), $\mathcal{M}$ contains the $\sigma$-algebra generated by $\mathcal{A}$, which is the
$\sigma$-algebra on which $\mu_1 \times \mu_2$ is defined. Thus, the integral of $\psi$ over every
measurable set is zero, which implies that $\psi$ is zero almost everywhere. ∎

The preceding example suggests the following general principle. 


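Axiom 9 The Hilbert space for a composite system made up of two
subsystems is the Hilbert tensor product $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$ of the Hilbert spaces $\mathbf{H}_1$ and
$\mathbf{H}_2$ describing the subsystems.


If $A$ and $B$ are bounded operators on $\mathbf{H}_1$ and $\mathbf{H}_2$, respectively, then there
is a unique bounded operator $A \otimes B$ on $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$ such that

\[
(A \otimes B)(\phi \otimes \psi) = (A\phi) \otimes (B\psi)
\]

for all $\phi \in \mathbf{H}_1$ and $\psi \in \mathbf{H}_2$. (See Appendix A.4.5.)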


Theorem 19.13 Suppose that $\rho$ is a density matrix on $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$. Then
there exists a unique density matrix $\rho^{(1)}$ on $\mathbf{H}_1$ with the property that

\[
\operatorname{trace}(\rho^{(1)} A) = \operatorname{trace}(\rho(A \otimes I)) \tag{19.5}
\]

for all $A \in \mathcal{B}(\mathbf{H}_1)$. We call $\rho^{(1)}$ the partial trace of $\rho$ with respect to $\mathbf{H}_2$. If
$\{f_k\}$ is an orthonormal basis for $\mathbf{H}_2$, then the operator $\rho^{(1)}$ satisfies

\[
\langle \phi, \rho^{(1)} \psi \rangle = \sum_k \langle \phi \otimes f_k, \rho(\psi \otimes f_k) \rangle \tag{19.6}
\]

for all $\phi, \psi \in \mathbf{H}_1$. Similarly, there is a unique density matrix $\rho^{(2)}$ on $\mathbf{H}_2$
satisfying $\operatorname{trace}(\rho^{(2)} B) = \operatorname{trace}(\rho(I \otimes B))$ for all $B \in \mathcal{B}(\mathbf{H}_2)$. If $\{e_j\}$ is an
orthonormal basis for $\mathbf{H}_1$, then $\rho^{(2)}$ satisfies

\[
\langle \phi, \rho^{(2)} \psi \rangle = \sum_j \langle e_j \otimes \phi, \rho(e_j \otimes \psi) \rangle \tag{19.7}
\]

for all $\phi, \psi \in \mathbf{H}_2$.

The motivation for the terminology “partial trace” is provided by (19.6)
and (19.7), which are similar to the formula for the trace of an operator,
except that the sums range only over a basis for one of the two Hilbert
spaces. One special case of Theorem 19.13 is the one in which the density
matrix $\rho$ is of the form $\rho = \rho_1 \otimes \rho_2$, where $\rho_1$ and $\rho_2$ are density matrices on
$\mathbf{H}_1$ and $\mathbf{H}_2$, respectively. (Any operator $\rho$ of this form is a density matrix
on $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$.) In that case, it is not hard to see that $\rho^{(1)} = \rho_1$ and $\rho^{(2)} = \rho_2$.
We may describe this case by saying that the state of the first system is
“independent” of the state of the second system.
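
For finite-dimensional factors, formulas (19.6) and (19.7) become sums over blocks of a matrix, which the following sketch implements. The index convention (the basis vector e_j ⊗ f_k sits at position j·d2 + k), the sample matrices, and the function names are assumptions of this example. It checks the product-state case described above and, for contrast, computes the partial trace of an entangled pure state.

```python
# Minimal sketch of the partial trace for C^d1 (x) C^d2 (illustrative names).

def kron(X, Y):
    """Matrix of X (x) Y in the basis e_j (x) f_k, ordered with index j*d2 + k."""
    d1, d2 = len(X), len(Y)
    return [[X[j][jp] * Y[k][kp] for jp in range(d1) for kp in range(d2)]
            for j in range(d1) for k in range(d2)]

def partial_trace_second(rho, d1, d2):
    """rho^(1): sum over the basis {f_k} of the second factor, as in (19.6)."""
    return [[sum(rho[j * d2 + k][jp * d2 + k] for k in range(d2))
             for jp in range(d1)] for j in range(d1)]

def partial_trace_first(rho, d1, d2):
    """rho^(2): sum over the basis {e_j} of the first factor, as in (19.7)."""
    return [[sum(rho[j * d2 + k][j * d2 + kp] for j in range(d1))
             for kp in range(d2)] for k in range(d2)]

# Product state rho1 (x) rho2: the partial traces recover the factors.
rho1 = [[0.75, 0.25], [0.25, 0.25]]
rho2 = [[0.5, 0.0], [0.0, 0.5]]
product = kron(rho1, rho2)
reduced_1 = partial_trace_second(product, 2, 2)
reduced_2 = partial_trace_first(product, 2, 2)

# Pure state |psi><psi| with psi = (e1 (x) f1 + e2 (x) f2)/sqrt(2).
entangled = [[0.5, 0.0, 0.0, 0.5],
             [0.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 0.0],
             [0.5, 0.0, 0.0, 0.5]]
reduced_entangled = partial_trace_second(entangled, 2, 2)
```

In the second case the composite state is pure, yet the reduced state is the mixed state diag(1/2, 1/2), so a subsystem of a system in a pure state need not itself be in a pure state.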


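Lemma 19.14 For any sequence $A_n \in \mathcal{B}(\mathbf{H}_1)$, if $\|A_n\psi - A\psi\| \to 0$ for
some $A \in \mathcal{B}(\mathbf{H}_1)$ and all $\psi \in \mathbf{H}_1$, then

\[
\|(A_n \otimes I)\phi - (A \otimes I)\phi\| \to 0
\]

for all $\phi \in \mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$. A similar result holds for operators of the form $I \otimes B_n$.


Proof. See Exercise 9. ∎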

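Proof of Theorem 19.13. The existence and uniqueness of $\rho^{(1)}$ and $\rho^{(2)}$
follow from Lemma 19.14 and Theorem 19.9. Meanwhile, if $\{e_j\}$ is an
orthonormal basis for $\mathbf{H}_1$ and $\{f_k\}$ is an orthonormal basis for $\mathbf{H}_2$, we
have

\[
\begin{aligned}
\langle \phi, \rho^{(1)} \psi \rangle &= \operatorname{trace}(\rho^{(1)} |\psi\rangle\langle\phi|) \\
&= \sum_{j,k} \langle e_j \otimes f_k, \rho(|\psi\rangle\langle\phi| \otimes I)(e_j \otimes f_k) \rangle \\
&= \sum_{j,k} \langle e_j \otimes f_k, \rho(\psi \langle \phi, e_j \rangle \otimes f_k) \rangle \\
&= \sum_k \Bigl\langle \Bigl( \sum_j \langle e_j, \phi \rangle e_j \Bigr) \otimes f_k, \rho(\psi \otimes f_k) \Bigr\rangle \\
&= \sum_k \langle \phi \otimes f_k, \rho(\psi \otimes f_k) \rangle .
\end{aligned}
\]

This is the desired formula for $\langle \phi, \rho^{(1)} \psi \rangle$. Note that because $\rho$ is trace class
and $|\psi\rangle\langle\phi| \otimes I$ is bounded, $\rho(|\psi\rangle\langle\phi| \otimes I)$ is trace class, in which case the sum
in the second line is absolutely convergent, by Proposition 19.3. Thus, we
are allowed to rearrange the sum freely. ∎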

Suppose we have two quantum systems with Hilbert spaces $\mathbf{H}_1$ and $\mathbf{H}_2$
and Hamiltonians $\hat{H}_1$ and $\hat{H}_2$. If the two systems do not interact with each
other and the composite system is initially in a (pure) state of the form
$\phi_0 \otimes \psi_0$, then we expect that at some later time, the composite system will
be in the state $\phi(t) \otimes \psi(t)$, where $\phi(t) = e^{-it\hat{H}_1/\hbar}\phi_0$ and $\psi(t) = e^{-it\hat{H}_2/\hbar}\psi_0$.
Ignoring domain considerations, we may compute that

\[
\begin{aligned}
i\hbar \frac{d}{dt}[\phi(t) \otimes \psi(t)] &= (\hat{H}_1 \phi(t)) \otimes \psi(t) + \phi(t) \otimes (\hat{H}_2 \psi(t)) \\
&= (\hat{H}_1 \otimes I + I \otimes \hat{H}_2)(\phi(t) \otimes \psi(t)).
\end{aligned}
\]

This calculation suggests that the correct Hamiltonian for a noninteracting
composite system is the operator $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$.

It is not, however, obvious how to select a domain for $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$
in such a way that this operator will be self-adjoint. (The reader is invited
to try to choose such a domain “by hand.”) The easiest way to deal with
this issue is to use Stone’s theorem, as in the following definition.


Definition 19.15 If $A$ and $B$ are self-adjoint operators on $\mathbf{H}_1$ and $\mathbf{H}_2$,
define the operator $A \otimes I + I \otimes B$ to be the infinitesimal generator of the strongly
continuous one-parameter unitary group $e^{itA} \otimes e^{itB}$. Thus, by Stone’s
theorem, $A \otimes I + I \otimes B$ is self-adjoint.


It is not hard to check that $e^{itA} \otimes e^{itB}$ is indeed strongly continuous. In
the case $B = 0$, the operator $A \otimes I$ is defined as the infinitesimal generator
of $e^{itA} \otimes I$. If $A$ and $B$ happen to be bounded, then $A \otimes I + I \otimes B$ defined by
Definition 19.15 coincides with $A \otimes I + I \otimes B$ defined as the sum of tensor
products of bounded operators, as in Sect. A.4.5.


Axiom 10 Suppose $\mathbf{H}_1$ and $\mathbf{H}_2$ are the Hilbert spaces for two quantum
systems, with Hamiltonians $\hat{H}_1$ and $\hat{H}_2$, respectively. Then the Hamiltonian
for the noninteracting composite system is $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$, where the domain
of $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$ is as in Definition 19.15.


A physicist would write $\hat{H}_1 \otimes I + I \otimes \hat{H}_2$ simply as $\hat{H}_1 + \hat{H}_2$, with the
understanding that $\hat{H}_1$ acts only on the first factor in the tensor product
and $\hat{H}_2$ acts only on the second factor.

In general, the two components of a composite system will interact, in
which case the Hamiltonian for the composite system is typically of the
form

\[
\hat{H} = \hat{H}_1 \otimes I + I \otimes \hat{H}_2 + \hat{H}_{\mathrm{int}},
\]

where $\hat{H}_{\mathrm{int}}$ is an “interaction term.” Often, the interaction term may be
considered “small” compared with the other terms in the Hamiltonian.
Consider, for example, a system consisting of particles in a box, with a
barrier dividing the box in half. Suppose the particles interact by means of
a two-particle potential of the form $\sum_{j<k} V(\mathbf{x}^j - \mathbf{x}^k)$ (Sect. 2.3.2) and that
$V(\mathbf{x}^j - \mathbf{x}^k)$ is very small unless the two particles are close together. There
will typically be far more pairs of nearby particles in which the two particles
are on the same side of the box than nearby pairs on opposite sides. Thus,
even though the interaction between the two systems may substantially
affect the behavior of the composite system over long periods of time, it is
still reasonable to think of $\hat{H}_1 \otimes I$ as “the energy of the first subsystem”
and $I \otimes \hat{H}_2$ as “the energy of the second subsystem.”

Suppose we start out in a state $\rho$ of the composite system for which
the state $\rho^{(1)}$ of the first subsystem is a pure state. If the system is an
interacting one, the first subsystem will probably not remain in a pure
state at later times. Indeed, suppose that the second subsystem is a very
large system having temperature $T$. Then, according to the postulates of
quantum statistical mechanics, we are supposed to believe that once the two
systems have reached thermal equilibrium, the state of the first subsystem
will be given by the following highly mixed state:

\[
\rho^{(1)} = \frac{1}{Z(\beta)} e^{-\beta \hat{H}_1}. \tag{19.8}
\]

Here $\beta = 1/(k_B T)$, where $k_B$ is Boltzmann’s constant, and $Z(\beta)$ is a
normalization constant, known as the partition function of the theory, given
by $Z(\beta) = \operatorname{trace}(e^{-\beta \hat{H}_1})$.

Of course, for this idea to make sense, $e^{-\beta \hat{H}_1}$ must be trace class. This
will be the case provided that $\hat{H}_1$ has discrete spectrum with eigenvalues
tending to $+\infty$ at some reasonable rate. Thus, in quantum statistical
mechanics, the expectation value of some observable $A$ for the first subsystem
will be (once equilibrium is reached)

\[
\langle A \rangle = \frac{1}{Z(\beta)} \operatorname{trace}(e^{-\beta \hat{H}_1} A). \tag{19.9}
\]

In particular, when $A = \hat{H}_1$, (19.9) provides a natural generalization of
Planck’s model of blackbody radiation; compare Exercise 2 in Chap. 1.
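
To make (19.8) and (19.9) concrete, the sketch below evaluates the thermal state and the mean energy for a Hamiltonian with eigenvalues nħω, n = 0, 1, 2, .... This spectrum, the truncation to 200 levels, the numerical values, and the function name are all assumptions of the example rather than anything fixed by the text. For this spectrum the exact mean energy is the geometric-series value ħω/(e^{βħω} - 1), which the truncated sum should reproduce.

```python
import math

def gibbs_state(energies, beta):
    """Diagonal entries of exp(-beta H)/Z(beta) for H with the given eigenvalues."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)                      # partition function Z(beta)
    return [w / Z for w in weights], Z

hbar_omega = 1.0                          # assumption: energy unit hbar*omega = 1
beta = 2.0                                # assumption: inverse temperature 1/(k_B T)
energies = [n * hbar_omega for n in range(200)]   # truncated spectrum n*hbar*omega

populations, Z = gibbs_state(energies, beta)

# Formula (19.9) with A = H: both operators are diagonal in the energy basis.
mean_energy = sum(p * E for p, E in zip(populations, energies))

# Closed form for the untruncated spectrum (a geometric series).
planck_value = hbar_omega / (math.exp(beta * hbar_omega) - 1.0)
```

The eigenvalues here grow linearly, which is more than enough for e^{-βH} to be trace class; the truncation error is of order e^{-400} and is invisible in floating point.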


19.6 Multiple Particles: Bosons and Fermions 


As discussed in Sect. 17.8, each type of particle (electron, proton, neutron,
etc.) has a spin $l$, where the possible values for $l$ are

\[
l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots .
\]

The Hilbert space for a particle moving in $\mathbb{R}^3$ and having spin $l$ is $L^2(\mathbb{R}^3) \hat{\otimes} V_l$,
where $V_l$ is a finite-dimensional Hilbert space that carries an irreducible
projective unitary representation of SO(3) of dimension $2l + 1$. There is a
natural unitary identification of $L^2(\mathbb{R}^3) \hat{\otimes} V_l$ with $L^2(\mathbb{R}^3; V_l)$, the space of
square-integrable functions on $\mathbb{R}^3$ with values in $V_l$, in which the element
$\psi \otimes v$ of $L^2(\mathbb{R}^3) \hat{\otimes} V_l$ is identified with the function

\[
\mathbf{x} \mapsto \psi(\mathbf{x}) v
\]

in $L^2(\mathbb{R}^3; V_l)$.

Now, we have already mentioned, in Sect. 3.11, the idea that in quantum
mechanics, identical particles are indistinguishable. Let us think about this
in the case of two identical particles with spin $l$. Our first guess as to
the Hilbert space for such a system is the tensor product of two copies of
$L^2(\mathbb{R}^3; V_l)$, which may be identified with

\[
L^2(\mathbb{R}^6; V_l \otimes V_l).
\]

If $\psi$ is a unit vector in this space, thought of as a pure state, then saying that
the two particles are “indistinguishable” means that $\psi(\mathbf{x}^2, \mathbf{x}^1)$ should
represent the same physical state as $\psi(\mathbf{x}^1, \mathbf{x}^2)$, that is, $\psi(\mathbf{x}^2, \mathbf{x}^1) = c\,\psi(\mathbf{x}^1, \mathbf{x}^2)$
for some nonzero constant $c$. Applying this rule twice shows that $c$ must
be either $1$ or $-1$.

A variety of theoretical and experimental considerations suggest the
following principle: For particles with integer spin ($l = 0, 1, \ldots$), the constant
$c$ in the preceding paragraph is $1$, whereas for particles with half-integer
spin ($l = 1/2, 3/2, \ldots$) the constant $c$ is $-1$. Particles with integer spin
are called bosons and particles with half-integer spin are called fermions.
We encode the discussion in the two preceding paragraphs in the following
axiom.


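Axiom 11 Consider a collection of $N$ identical particles moving in $\mathbb{R}^3$
and having integer spin $l$. Then the Hilbert space for such a collection is the
subspace of $L^2(\mathbb{R}^{3N}; (V_l)^{\otimes N})$ consisting of those square-integrable functions
$\psi$ for which

\[
\psi(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}) = \psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N)
\]

for every permutation $\sigma$. Consider also a collection of $N$ identical particles
moving in $\mathbb{R}^3$ and having half-integer spin $l$. Then the Hilbert space for
such a collection is the subspace of $L^2(\mathbb{R}^{3N}; (V_l)^{\otimes N})$ consisting of those
square-integrable functions $\psi$ for which

\[
\psi(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}) = \operatorname{sign}(\sigma)\, \psi(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N)
\]

for every permutation $\sigma$.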


One may well ask why Axiom 11 holds. More specifically, one may first 
ask why it is that identical particles are indistinguishable, and then sepa- 
rately ask why integer-spin particles are bosons and half-integer-spin par- 
ticles are fermions. Both questions are best answered from the point of 
view of quantum field theory, to which ordinary nonrelativistic quantum 
mechanics is an approximation. 

In field theory, one starts with a “classical” field theory, meaning a
differential equation for functions $\phi(\mathbf{x}, t)$ on $\mathbb{R}^4$ with values in some
finite-dimensional vector space. Electromagnetic fields, for example, are (at any
one fixed time) functions on $\mathbb{R}^3$ with values in $\mathbb{R}^6$, where $\mathbb{R}^6$ describes
the three components of the electric field and the three components of the
magnetic field. These functions on $\mathbb{R}^3$ then evolve in time according to
Maxwell’s equations. In quantum field theory, one regards, say, Maxwell’s
equations as a sort of infinite-dimensional dynamical system, which we may
quantize in something like the way we quantize Newton’s equation to get
ordinary nonrelativistic quantum mechanics. In the quantum version of
Maxwell’s equations, the energy in each mode of the fields is “quantized,”
meaning that one can only add energy to each mode in multiples of a certain
unit (or “quantum”) of energy. This is analogous to the quantum harmonic
oscillator, in which the allowed energies differ by integer multiples of
$\hbar\omega$. In quantum field theory, then, a particle is one quantum of excitation
of a certain field.

For simplicity, let us think of a field theory in which the classical fields
take values in $\mathbb{R}$. Then even at the classical level, it is possible to think
that we have something like particles, namely localized bumps in the field
$\phi(\mathbf{x})$ located at several different points in space. These bumps might, for
example, be in the shape of a Gaussian wave packet, that is, a Gaussian
envelope multiplied by a sinusoidally oscillating function. From this point of
view, we can gain some understanding of why identical particles are
indistinguishable. Suppose we have a Gaussian wave packet near a point $\mathbf{a}$ in $\mathbb{R}^3$
and then an identically shaped Gaussian wave packet near another point $\mathbf{b}$.
The state $\phi(\mathbf{x})$ of the field is precisely the same as if we have a packet near
$\mathbf{b}$ and then also a packet near $\mathbf{a}$. That is to say, there is no distinct state of
the system that corresponds to interchanging the two particles; whichever
bump we think of as the “first” particle, we have the same field $\phi(\mathbf{x})$. Even
in the quantum version of such a system, there is no meaning to asking which
is the first particle and which is the second. Thus, even in nonrelativistic
quantum mechanics, which is a low-energy approximation to quantum field
theory, we expect identical particles to be indistinguishable.

Although the preceding discussion does not explain the distinction be- 
tween bosons and fermions, that distinction also emerges from quantum 
field theory, through something called the spin-statistics theorem 
(see, e.g., [38]). 


19.7 “Statistics” and the Pauli Exclusion Principle 


The spin of an electron is equal to 1/2 and electrons are, therefore, fermions.
The famous Pauli exclusion principle is a consequence of the fermionic
nature of electrons. Pauli’s principle states that two electrons cannot be
in the same state at the same time. This means that if $\psi$ is a
square-integrable, $\mathbb{C}^2$-valued function on $\mathbb{R}^3$ (which could describe the state of a
single electron), then the function $\Psi : \mathbb{R}^6 \to \mathbb{C}^2 \otimes \mathbb{C}^2$ given by

\[
\Psi(\mathbf{x}^1, \mathbf{x}^2) = \psi(\mathbf{x}^1) \otimes \psi(\mathbf{x}^2)
\]

is not a possible state of a two-electron system, since $\Psi$ does not satisfy
Axiom 11. On the other hand, if $\psi_1$ and $\psi_2$ are two linearly independent
elements of $L^2(\mathbb{R}^3; \mathbb{C}^2)$, then the function $\Phi : \mathbb{R}^6 \to \mathbb{C}^2 \otimes \mathbb{C}^2$ given by

\[
\Phi(\mathbf{x}^1, \mathbf{x}^2) = \psi_1(\mathbf{x}^1) \otimes \psi_2(\mathbf{x}^2) - \psi_2(\mathbf{x}^1) \otimes \psi_1(\mathbf{x}^2) \tag{19.10}
\]

is a possible state of a two-electron system. [If $\psi_1$ and $\psi_2$ are
independent, then $\Phi$ is a nonzero element of $L^2(\mathbb{R}^6; \mathbb{C}^2 \otimes \mathbb{C}^2)$, which can then be
normalized to be a unit vector. See Exercise 10.]
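
A discrete toy version of this dichotomy is sketched below. It assumes that each single-particle state is a vector indexed by a finite list of combined position-and-spin labels and that exchanging the two particles swaps both labels; these modeling choices, the sample vectors, and the function names are assumptions of the example.

```python
# Toy check of the exclusion principle on a finite set of single-particle labels.

def tensor(u, v):
    """Two-particle amplitude table Psi[a][b] = u[a] * v[b]."""
    return [[ua * vb for vb in v] for ua in u]

def exchange(Psi):
    """Swap the two particles: the new table at (a, b) is Psi[b][a]."""
    n = len(Psi)
    return [[Psi[b][a] for b in range(n)] for a in range(n)]

def subtract(X, Y):
    """Entrywise difference of two tables."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def is_antisymmetric(Psi, tol=1e-12):
    """True if exchanging the particles multiplies the state by -1."""
    n = len(Psi)
    return all(abs(Psi[a][b] + Psi[b][a]) <= tol
               for a in range(n) for b in range(n))

def norm_squared(Psi):
    """Sum of squared magnitudes of all amplitudes."""
    return sum(abs(x) ** 2 for row in Psi for x in row)

psi1 = [1.0, 0.0, 0.0]
psi2 = [0.0, 1.0, 1.0]      # linearly independent of psi1

# Both particles in the same state: not antisymmetric, and its
# antisymmetrization is the zero vector.
same_state = tensor(psi1, psi1)
same_state_antisymmetrized = subtract(same_state, exchange(same_state))

# The analogue of (19.10): nonzero and antisymmetric.
Phi = subtract(tensor(psi1, psi2), tensor(psi2, psi1))
```

Since `same_state_antisymmetrized` vanishes identically, there is no way to normalize it into a state, whereas `Phi` has positive norm and can be normalized.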

Let us try to understand the implications of the Pauli exclusion principle
for multielectron atoms. Let us model an $N$-electron atom as having a
nucleus with positive charge $Nq$, where the charge of a single electron is
$-q$. Since the nucleus is much more massive than the electrons, we can
treat the nucleus as being fixed and the electrons as moving in a potential
of the form $-Nq^2/|\mathbf{x}|$. As a very rough approximation to the structure of
such an atom, we can ignore the electron-electron interaction and take a
Hamiltonian of the form

\[
\sum_{j=1}^{N} \left( -\frac{\hbar^2}{2m} \Delta_j - \frac{Nq^2}{|\mathbf{x}^j|} \right),
\]

where $m$ is the mass of the electron and $\Delta_j$ is the Laplacian acting on the $j$th variable. That is, we are taking
our Hamiltonian to be simply

\[
(\hat{H} \otimes I \otimes I \otimes \cdots \otimes I) + (I \otimes \hat{H} \otimes I \otimes \cdots \otimes I) + (I \otimes I \otimes \hat{H} \otimes \cdots \otimes I) + \cdots,
\]

where $\hat{H}$ is the Hamiltonian for a single electron.
If, say, $N$ is even, the lowest-energy state for this Hamiltonian in the
antisymmetric subspace of $L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$ will be

\[
\Psi_0(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N)
= AS\bigl( \psi_0^+(\mathbf{x}^1) \otimes \psi_0^-(\mathbf{x}^2) \otimes \psi_1^+(\mathbf{x}^3) \otimes \cdots \otimes \psi_{N/2-1}^+(\mathbf{x}^{N-1}) \otimes \psi_{N/2-1}^-(\mathbf{x}^N) \bigr). \tag{19.11}
\]

If $N$ is odd, the product ends with $\psi_{(N-1)/2}^+(\mathbf{x}^N)$. The notation in (19.11)
is as follows. First, $AS$ is the antisymmetrization operator, given by

\[
AS(f)(\mathbf{x}^1, \ldots, \mathbf{x}^N) = \sum_{\sigma \in S_N} \operatorname{sign}(\sigma) f(\mathbf{x}^{\sigma(1)}, \mathbf{x}^{\sigma(2)}, \ldots, \mathbf{x}^{\sigma(N)}).
\]

Second, the functions $\psi_0, \psi_1, \psi_2, \ldots$ are the eigenvectors in $L^2(\mathbb{R}^3)$ for the
Hamiltonian of a single particle in $\mathbb{R}^3$ moving in a potential of the form
$-Nq^2/|\mathbf{x}|$, arranged so that the eigenvalues of the $\psi_j$’s are weakly increasing
with $j$. The $\psi_j$’s are just the states computed in Chap. 18, but with $q$
replaced by $\sqrt{N}q$. Third, $\psi_j^+(\mathbf{x})$ denotes $\psi_j(\mathbf{x}) \otimes e_1$ and $\psi_j^-(\mathbf{x})$ denotes
$\psi_j(\mathbf{x}) \otimes e_2$, where $\{e_1, e_2\}$ is the standard basis for $\mathbb{C}^2$.

What the expression for $\Psi_0$ means is that, if we ignore (at first) the
interaction between the electrons, but retain the Pauli exclusion principle, then
we put the first electron into the ground state of the single-electron system,
with “spin up” (i.e., tensored with $e_1$). Then we put the second electron
into the ground state with “spin down” (tensored with $e_2$). Then the third
electron goes into the first excited state of the single-electron system with
spin up, and so on. Of course, this model of a multielectron atom is very
rough, since the interaction between the electrons actually plays a
significant role. Nevertheless, this model highlights the critical role played by
the exclusion principle, which forces successive electrons to go into higher
and higher energy states. In particular, this crude approximation suggests
(correctly!) that even for more realistic models of a multielectron atom, the
lowest energy level in the antisymmetric subspace of $L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$ is
much higher than the lowest energy level of the same Hamiltonian in all of
$L^2(\mathbb{R}^{3N}; (\mathbb{C}^2)^{\otimes N})$.

Meanwhile, in quantum statistical mechanics, one considers a large
collection of identical particles confined to some finite region of space. If the
system is isolated (rather than in thermal equilibrium with its
environment), the goal of statistical mechanics is to “count” the number $N(E)$ of
quantum states with energy less than $E$, as a function of $E$. [That is, $N(E)$
is the number of eigenvalues for the Hamiltonian less than $E$, counted with their
multiplicity.] As the preceding discussion of the Pauli exclusion principle
suggests, we will get very different answers for $N(E)$ if the particles are
fermions than if they are bosons. Bosons are said to follow Bose-Einstein
statistics, whereas fermions are said to follow Fermi–Dirac statistics. The
term “statistics” here refers to the different behavior of the two types of
particles in quantum statistical mechanics. The spin-statistics theorem in
quantum field theory tells us that particles with integer spin have to be
bosons (obeying Bose-Einstein statistics) and particles with half-integer
spin have to be fermions (obeying Fermi–Dirac statistics).

One fascinating example of quantum statistical mechanics occurs when
the particles are bosons and the interaction between particles is negligible.
In that case, the lowest energy state will simply be

\[
\Psi_0(\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N) = \psi_0(\mathbf{x}^1) \otimes \psi_0(\mathbf{x}^2) \otimes \cdots \otimes \psi_0(\mathbf{x}^N),
\]

where $\psi_0$ is the ground state of the single-particle system. Now, quantum
statistical mechanics tells us that at a given temperature, the state of the
system will be an (incoherent) superposition of the ground state and the
various excited states. If the temperature is low enough, then the
coefficient of the ground state will be close to 1, and thus, “all the particles are
in the ground state.” A system in such a state is called a Bose-Einstein
condensate, a state that was predicted on theoretical grounds by Satyendra
Nath Bose and Einstein in the 1920s. Bose-Einstein condensates were first
observed experimentally in laser-cooled gases in June 1995 by Eric Cornell
and Carl Wieman, in work for which they, along with Wolfgang Ketterle,
were awarded the 2001 Nobel Prize in physics.


19.8 Exercises 


1. Suppose that $X$ is a Hilbert–Schmidt operator on $\mathbf{H}$ and that $\{e_j\}$ is
an orthonormal basis for $\mathbf{H}$. Show that

\[
\|X\|_{HS}^2 = \sum_{j,k} |\langle e_j, X e_k \rangle|^2 .
\]

2. Given $\phi, \psi \in \mathbf{H}$, let $|\phi\rangle\langle\psi|$ denote the operator defined in Notation 3.28.
Show that if $A \in \mathcal{B}(\mathbf{H})$ is trace class, then

\[
\operatorname{trace}(A |\phi\rangle\langle\psi|) = \langle \psi, A\phi \rangle .
\]

Hint: If $\{e_j\}$ is an orthonormal basis for $\mathbf{H}$, then for any $\chi \in \mathbf{H}$, we
have $\chi = \sum_j \langle e_j, \chi \rangle e_j$.

3. Suppose $\Phi : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ is a linear functional with the properties
(1) that $\Phi(A)$ is real whenever $A$ is self-adjoint and (2) that $\Phi(A)$
is real and non-negative whenever $A$ is self-adjoint and non-negative.
Show that if $A$ is self-adjoint, then

\[
-\|A\| \Phi(I) \le \Phi(A) \le \|A\| \Phi(I).
\]

Conclude that $\Phi$ is bounded relative to the operator norm on $\mathcal{B}(\mathbf{H})$.
Hint: Show that if $A$ is self-adjoint, then $\|A\| I + A$ and $\|A\| I - A$ are
non-negative.


4. An operator $A \in \mathcal{B}(\mathbf{H})$ is said to have finite rank if $\operatorname{range}(A)$ is finite
dimensional.

(a) Show that if $A \in \mathcal{B}(\mathbf{H})$ has finite rank, then so does $A^*$.

(b) Given $A \in \mathcal{B}(\mathbf{H})$, show that $A$ has finite rank if and only if there
exist vectors $\phi_1, \ldots, \phi_n$ and $\psi_1, \ldots, \psi_n$ such that

\[
A = |\phi_1\rangle\langle\psi_1| + \cdots + |\phi_n\rangle\langle\psi_n| .
\]

(c) Let $A$ be any element of $\mathcal{B}(\mathbf{H})$, let $\{e_j\}$ be an orthonormal basis
for $\mathbf{H}$, and let $P_N$ be the orthogonal projection onto the span
of $e_1, \ldots, e_N$. Show that $P_N A$ has finite rank and that for all
$\psi \in \mathbf{H}$, we have

\[
\lim_{N \to \infty} \|P_N A\psi - A\psi\| = 0.
\]

Note: This result shows that each bounded operator can be
expressed as a strong limit of finite-rank operators. By contrast,
if $\dim \mathbf{H} = \infty$, then Part (a) of Exercise 5 shows that not every
bounded operator can be expressed as an operator-norm limit
of finite-rank operators.


5. In this exercise, assume that $\dim \mathbf{H} = \infty$.

(a) Show that if $A$ has finite rank, then $\|A + cI\| \ge |c|$ for any $c \in \mathbb{C}$.
(With $c = -1$, this shows that $I$ is not an operator-norm limit
of finite-rank operators.)

(b) Let $\mathcal{K}(\mathbf{H})$ denote the closure of the finite-rank operators with
respect to the operator norm on $\mathcal{B}(\mathbf{H})$. Let $V$ denote the space
of operators of the form $B + cI$, with $B \in \mathcal{K}(\mathbf{H})$. Define a linear
functional $\Phi : V \to \mathbb{C}$ by $\Phi(B + cI) = c$ for all $B \in \mathcal{K}(\mathbf{H})$. Show
that $|\Phi(A)| \le \|A\|$ for all $A \in V$.

Note: It can be shown that $\mathcal{K}(\mathbf{H})$ is precisely the space of
compact operators on $\mathbf{H}$.

(c) Let $\Psi_1 : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$ be any linear functional such that $\Psi_1 = \Phi$ on
$V$ and such that $|\Psi_1(A)| \le \|A\|$ for all $A \in \mathcal{B}(\mathbf{H})$. (Such a
functional exists by the Hahn–Banach theorem.) Let $\Psi_2 : \mathcal{B}(\mathbf{H}) \to \mathbb{C}$
be defined by

\[
\Psi_2(A) = \frac{1}{2}\bigl( \Psi_1(A) + \overline{\Psi_1(A^*)} \bigr).
\]

Show that $\Psi_2$ satisfies Properties 1, 2, and 3 of Definition 19.6,
but that there does not exist any density matrix $\rho$ such that
$\Psi_2(A) = \operatorname{trace}(\rho A)$ for all $A \in \mathcal{B}(\mathbf{H})$. (Thus, in light of
Theorem 19.9, $\Psi_2$ must not satisfy Property 4 of Definition 19.6.)


6. In Exercises 6, 7, and 8, assume that each density matrix $\rho$ is
compact, so that $\rho$ has an orthonormal basis $\{e_j\}$ of eigenvectors, for
which the associated eigenvalues $\{\lambda_j\}$ are real and tend to zero as $j$
tends to infinity. (Compare Theorem VI.16 in [34].)

Show that a density matrix $\rho$ is a pure state if and only if $\operatorname{trace}(\rho^2) = 1$.

7. (a) Show that each mixed state $\rho$ is a nontrivial convex combination
of other density matrices.

(b) Show that a pure state cannot be expressed as a nontrivial convex
combination of other density matrices.

Hint: Show that the function $f(\lambda) := \operatorname{trace}\bigl((\lambda\rho_1 + (1 - \lambda)\rho_2)^2\bigr)$ is a
convex function of $\lambda$.

8. For any density matrix $\rho$, show that the von Neumann entropy $S(\rho) :=
\operatorname{trace}(-\rho \log \rho)$ is zero if and only if $\rho$ is a pure state.

9. Prove Lemma 19.14.

Hint: First use the principle of uniform boundedness (Theorem A.40)
to show that there exists a constant $C$ with $\|A_n\| \le C$ for all $n$. Then, if
$\{f_j\}$ is an orthonormal basis for $\mathbf{H}_2$, decompose $\mathbf{H}_1 \hat{\otimes} \mathbf{H}_2$ as the Hilbert
space direct sum of the subspaces $\mathbf{H}_1 \otimes f_j$, where each of these subspaces
is isometrically identified with $\mathbf{H}_1$ in the obvious way.

10. Suppose that $\psi_1$ and $\psi_2$ are two linearly independent elements of
$L^2(\mathbb{R}^3; \mathbb{C}^2)$. Show that the function $\Phi$ in (19.10) is a nonzero element
of $L^2(\mathbb{R}^6; \mathbb{C}^2 \otimes \mathbb{C}^2)$.


20 


The Path Integral Formulation 
of Quantum Mechanics 


We turn now to a topic that is important already for ordinary quantum
mechanics and essential in quantum field theory: the so-called path
integral. In the setting of ordinary quantum mechanics (of the sort we have
been considering in this book), the integrals in question are over spaces of
“paths,” that is, maps of some interval $[a, b]$ into $\mathbb{R}^n$. In the setting of
quantum field theory, the integrals are integrals over spaces of “fields,” that is,
maps of some region inside $\mathbb{R}^d$ into $\mathbb{R}^n$. Formal integrals of this sort abound
in the physics literature, and it is typically difficult to make rigorous
mathematical sense of them, although much effort has been expended in the
attempt! In this chapter, we will develop a rigorous integral over spaces of
paths by using the Wiener measure, resulting in the Feynman–Kac formula.
We begin with the Trotter product formula, which will be our main tool
in deriving the path integral formulas. From there we turn to the (heuristic)
path integral formula of Feynman, and then to the rigorous version of
Feynman’s result obtained by M. Kac, the so-called Feynman–Kac formula.
Although it is not feasible to give complete proofs of all results presented
here, we give enough proofs to get a flavor of the mathematics involved.
We will prove a version of the Trotter product formula and, assuming the
existence of the Wiener measure, a version of the Feynman–Kac formula.


20.1 Trotter Product Formula


The Lie product formula (Point 7 of Theorem 16.15) says that for all $X$
and $Y$ in $M_n(\mathbb{C})$, we have

\[
e^{X+Y} = \lim_{m \to \infty} \bigl( e^{X/m} e^{Y/m} \bigr)^m .
\]

The Trotter product formula asserts that a similar result holds for certain
classes of unbounded operators on Hilbert spaces.
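
The matrix version of the formula is easy to test numerically. The sketch below uses a hand-written power-series exponential, which is an assumption adequate only for small matrices of modest norm, together with two noncommuting 2×2 matrices chosen for illustration. It records how far (e^{X/m} e^{Y/m})^m is from e^{X+Y} for m = 1, 10, 100.

```python
# Numerical look at the Lie product formula for 2x2 real matrices.

def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    """Entrywise sum."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple c * X."""
    return [[c * x for x in row] for row in X]

def expm(X, terms=40):
    """Matrix exponential via its power series (sketch for small norms only)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, X))   # term is now X^k / k!
        result = add(result, term)
    return result

def matpow(X, m):
    """m-th power of X by repeated multiplication."""
    result = identity(len(X))
    for _ in range(m):
        result = matmul(result, X)
    return result

def max_abs_diff(X, Y):
    """Largest entrywise discrepancy between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]     # XY != YX, so e^X e^Y != e^(X+Y)

exact = expm(add(X, Y))
errors = []
for m in (1, 10, 100):
    step = matmul(expm(scale(1.0 / m, X)), expm(scale(1.0 / m, Y)))
    errors.append(max_abs_diff(matpow(step, m), exact))
```

The list `errors` shrinks as m grows, roughly in proportion to 1/m for these matrices, which is the behavior one expects from the leading commutator correction.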


Theorem 20.1 (Trotter Product Formula) Suppose that $A$ and $B$ are
self-adjoint operators on $\mathbf{H}$ and that $A + B$ is densely defined and essentially
self-adjoint on $\operatorname{Dom}(A) \cap \operatorname{Dom}(B)$. Then the following results hold.

1. For all $\psi \in \mathbf{H}$, we have

\[
\lim_{N \to \infty} \bigl\| e^{it(A+B)}\psi - (e^{itA/N} e^{itB/N})^N \psi \bigr\| = 0. \tag{20.1}
\]

2. If $A$ and $B$ are bounded below, then for all $\psi \in \mathbf{H}$, we have

\[
\lim_{N \to \infty} \bigl\| e^{-t(A+B)}\psi - (e^{-tA/N} e^{-tB/N})^N \psi \bigr\| = 0. \tag{20.2}
\]

In both results, the expression $A + B$ refers to the unique self-adjoint
extension of the operator defined on $\operatorname{Dom}(A) \cap \operatorname{Dom}(B)$.


In the usual terminology of functional analysis, (20.1) asserts that the
operators $(e^{itA/N} e^{itB/N})^N$ converge to $e^{it(A+B)}$ in the “strong operator
topology,” and similarly with (20.2).

We will give a proof of this result in the special case in which $A + B$
is densely defined and self-adjoint on $\operatorname{Dom}(A) \cap \operatorname{Dom}(B)$. This condition
holds, for example, whenever the Kato–Rellich theorem (Theorem 9.37)
applies. See Sect. A.5 of [14] for a proof of the version stated above.
Proof. Since all the operators in Point 1 of the theorem are unitary, it 
is easy to see that if the result holds on some dense subspace W of H, 
it holds on all of H. In Point 2 of the theorem, we first make a simple 
reduction to the case where A and B are non-negative, and then have the 
same conclusion, since all operators involved will then be contractions. 

We will prove Point 1 of the theorem, with the proof of Point 2 being sim- 
ilar. Let us introduce the notation S, := e*4+) and T, := e#*4e**8. What 
we want to prove is that for each ~ € H, the quantity ||(S;— (Ti) )p|| 
tends to zero as N tends to infinity. Now, a simple calculation shows that 


N-1 
Il (Se = (Tw) || = (Tin)? (Sein — Try (St )s 3 - (20.3) 
0 


= 
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Since S. is a one-parameter unitary group, (St/w)%~J~-'w = Ss, where 
s=(N—j-—1)t/N. Thus, if we let w, = S,w, we have 


I|(St — (Tey) || < N sup ||(Stzw — Try) es|] - (20.4) 
O<s<t 


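Now, for any $\psi$ in $\mathrm{Dom}(A+B)$, we have

\[
\lim_{N\to\infty}N(S_{t/N}\psi-\psi)=it(A+B)\psi,
\]

by Stone's theorem. Meanwhile, according to Exercise 2, we have

\[
\lim_{s\to 0}\frac{1}{s}(T_s-I)\psi=iA\psi+iB\psi, \tag{20.5}
\]

for all $\psi\in\mathrm{Dom}(A)\cap\mathrm{Dom}(B)$. (This result is clear at the heuristic level.)
Thus,

\[
\begin{aligned}
\lim_{N\to\infty}N(S_{t/N}-T_{t/N})\psi
&=\lim_{N\to\infty}N(S_{t/N}-I)\psi-\lim_{N\to\infty}N(T_{t/N}-I)\psi\\
&=it(A+B)\psi-it(A+B)\psi=0
\end{aligned} \tag{20.6}
\]

for every $\psi\in\mathrm{Dom}(A)\cap\mathrm{Dom}(B)$.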
Let $W=\mathrm{Dom}(A)\cap\mathrm{Dom}(B)$, which is, by assumption, dense in $\mathbf{H}$,
equipped with the norm $\|\cdot\|_1$ given by

\[
\|\psi\|_1=\|\psi\|+\|(A+B)\psi\|.
\]

Since we are assuming $A+B$ is self-adjoint, and thus also closed, on $W$,
we see that $W$ is a Banach space with respect to $\|\cdot\|_1$ (Exercise 6 in Chap. 9).
Now, the operators $N(S_{t/N}-T_{t/N})$ are certainly bounded from $W$ to $\mathbf{H}$,
for each $N$. Furthermore, (20.6) shows that for each $\psi\in W$, we have

\[
\sup_{N}\left\|N(S_{t/N}-T_{t/N})\psi\right\|<\infty.
\]

Thus, by the principle of uniform boundedness (Theorem A.40), there is a
constant $C$ such that

\[
\left\|N(S_{t/N}-T_{t/N})\psi\right\|\leq C\|\psi\|_1
\]

for all $\psi\in W$. It then follows (Exercise 3) that $\|N(S_{t/N}-T_{t/N})\psi\|$ tends
to zero uniformly on every compact subset of $W$.

Suppose, now, that for each $\psi\in W$, the map $s\mapsto\psi_s$ is continuous in $W$. If
so, the image of the compact interval $[0,t]$ under $s\mapsto\psi_s$ will be compact
in $W$, and so $\|N(S_{t/N}-T_{t/N})\psi_s\|$ will tend to zero uniformly in $s$. Thus,
by (20.4), we will have Point 1 of the theorem. To establish the desired
continuity, we first note that by Lemma 10.17, the operators $S_s=e^{is(A+B)}$
preserve $\mathrm{Dom}(A+B)$, which is equal to $W$, by assumption. Then for any
$s,r\in[0,t]$ and $\psi\in W$, we have

\[
\begin{aligned}
\left\|e^{is(A+B)}\psi-e^{ir(A+B)}\psi\right\|_1
&=\left\|e^{is(A+B)}\psi-e^{ir(A+B)}\psi\right\|+\left\|(A+B)\left(e^{is(A+B)}\psi-e^{ir(A+B)}\psi\right)\right\|\\
&=\left\|\left(e^{is(A+B)}-e^{ir(A+B)}\right)\psi\right\|+\left\|\left(e^{is(A+B)}-e^{ir(A+B)}\right)(A+B)\psi\right\|,
\end{aligned} \tag{20.7}
\]

where we have used Lemma 10.17 again in the second equality. The strong
continuity of $e^{is(A+B)}$ (Proposition 10.14) then ensures that the right-hand
side of (20.7) tends to zero as $s$ approaches $r$. ∎


20.2 Formal Derivation of the Feynman Path Integral


In this section, we apply Point 1 of the Trotter product formula to the
operator

\[
\hat{H}=-\frac{\hbar^{2}}{2m}\Delta+V(\mathbf{X}). \tag{20.8}
\]

Let us call the operators on the right-hand side of (20.8) $A$ and $B$,
respectively, and let us assume $V$ is sufficiently nice that $\hat{H}$ is essentially
self-adjoint on $\mathrm{Dom}(A)\cap\mathrm{Dom}(B)$. Any bounded potential certainly has
this property, as do many unbounded potentials. (See, e.g., Theorem 9.38.)
Point 1 of Theorem 20.1 then tells us that

\[
e^{-it\hat{H}/\hbar}\psi=\lim_{N\to\infty}\left(\exp\left\{\frac{i\hbar t\Delta}{2mN}\right\}\exp\left\{-\frac{itV(\mathbf{X})}{N\hbar}\right\}\right)^{N}\psi.
\]


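Under mild assumptions on $\psi$, Theorem 4.5 (extended to $n$ dimensions)
tells us that $\exp(i\hbar t\Delta/(2mN))$ may be computed as

\[
\left(e^{i\hbar t\Delta/(2mN)}\psi\right)(\mathbf{x}_0)=\left(\frac{mN}{2\pi i\hbar t}\right)^{n/2}\int_{\mathbb{R}^n}\exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_1-\mathbf{x}_0|^{2}\right\}\psi(\mathbf{x}_1)\,d\mathbf{x}_1.
\]

Meanwhile, $\exp(-itV(\mathbf{X})/(N\hbar))$ is simply a multiplication operator.
Thus, assuming that Theorem 4.5 applies at each stage, we have

\[
\begin{aligned}
&\left[\left(\exp\left\{\frac{i\hbar t\Delta}{2mN}\right\}\exp\left\{-\frac{itV(\mathbf{X})}{N\hbar}\right\}\right)^{N}\psi\right](\mathbf{x}_0)\\
&\quad=C\int_{\mathbb{R}^n}\exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_1-\mathbf{x}_0|^{2}\right\}\exp\left\{-\frac{itV(\mathbf{x}_1)}{N\hbar}\right\}\cdots\\
&\qquad\times\int_{\mathbb{R}^n}\exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_{N-1}-\mathbf{x}_{N-2}|^{2}\right\}\exp\left\{-\frac{itV(\mathbf{x}_{N-1})}{N\hbar}\right\}\\
&\qquad\times\int_{\mathbb{R}^n}\exp\left\{i\frac{mN}{2t\hbar}|\mathbf{x}_{N}-\mathbf{x}_{N-1}|^{2}\right\}\exp\left\{-\frac{itV(\mathbf{x}_{N})}{N\hbar}\right\}\\
&\qquad\times\psi(\mathbf{x}_N)\,d\mathbf{x}_N\,d\mathbf{x}_{N-1}\cdots d\mathbf{x}_1,
\end{aligned}
\]

where $C=(mN/(2\pi i\hbar t))^{nN/2}$. Letting $\varepsilon=t/N$ and assuming we can freely
rearrange the order of integration, we obtain

\[
\begin{aligned}
\left(e^{-it\hat{H}/\hbar}\psi\right)(\mathbf{x}_0)
&=\lim_{N\to\infty}C\int_{(\mathbb{R}^n)^N}\exp\left\{\frac{i}{\hbar}\sum_{j=1}^{N}\varepsilon\left[\frac{m}{2}\left|\frac{\mathbf{x}_j-\mathbf{x}_{j-1}}{\varepsilon}\right|^{2}-V(\mathbf{x}_j)\right]\right\}\\
&\qquad\times\psi(\mathbf{x}_N)\,d\mathbf{x}_1\,d\mathbf{x}_2\cdots d\mathbf{x}_N.
\end{aligned} \tag{20.9}
\]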


So far, the argument is mostly rigorous, coming from the Trotter product
formula and Theorem 4.5. The nonrigorous part comes in attempting to
evaluate the limit on the right-hand side of (20.9). Let us think of the
values $\mathbf{x}_j$, $j=0,\ldots,N$, as constituting the values of a path $\mathbf{x}(s)$ at the
points $s_j := j\varepsilon = jt/N$:

\[
\mathbf{x}_j=\mathbf{x}(jt/N).
\]

Since the distance between $s_{j-1}$ and $s_j$ is $\varepsilon$, the quantity $(\mathbf{x}_j-\mathbf{x}_{j-1})/\varepsilon$ is
an approximation to the derivative of $\mathbf{x}(s)$ with respect to $s$. Meanwhile,
the sum over $j$ in the right-hand side of (20.9) is an approximation to an
integral. Thus, if we then take the limit of the right-hand side of (20.9) in a
totally nonrigorous fashion, we obtain

\[
\left(e^{-it\hat{H}/\hbar}\psi\right)(\mathbf{x}_0)=C\int_{\mathbf{x}(0)=\mathbf{x}_0}\exp\left\{\frac{i}{\hbar}\int_{0}^{t}\left[\frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^{2}-V(\mathbf{x}(s))\right]ds\right\}\psi(\mathbf{x}(t))\,\mathcal{D}\mathbf{x}. \tag{20.10}
\]
Here, $C$ is a normalization constant and $\mathcal{D}\mathbf{x}$ is something like "Lebesgue
measure" on the space of all paths $\mathbf{x}(\cdot)$ mapping $[0,t]$ into $\mathbb{R}^n$. (The quantity
$\mathbf{x}$ in the expression $\mathcal{D}\mathbf{x}$ is a path, not a point in $\mathbb{R}^n$.)

The reader who is familiar with the Lagrangian approach to mechanics
will recognize the expression in square brackets in the exponent on the
right-hand side of (20.10) as the Lagrangian of the particle, $L=T-V$,
where $T=(1/2)m|\mathbf{v}|^{2}$ is the kinetic energy and $V$ is the potential energy.
The integral of the Lagrangian over some time interval is called the action
functional, denoted by the letter $S$. That is to say, given a path $\mathbf{x}(\cdot)$, we
define the action functional of $\mathbf{x}(\cdot)$ over a time interval $[a,b]$ as follows:

\[
S(\mathbf{x}(\cdot),a,b)=\int_{a}^{b}\left[\frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^{2}-V(\mathbf{x}(s))\right]ds. \tag{20.11}
\]

In Lagrangian mechanics, one shows that the solutions to Newton's law are
precisely the stationary points of the action functional. Using the notation
in (20.11), we may rewrite (20.10) as

\[
\left(e^{-it\hat{H}/\hbar}\psi\right)(\mathbf{x}_0)=C\int_{\mathbf{x}(0)=\mathbf{x}_0}\exp\left\{\frac{i}{\hbar}S(\mathbf{x}(\cdot),0,t)\right\}\psi(\mathbf{x}(t))\,\mathcal{D}\mathbf{x}. \tag{20.12}
\]


This formula is the Feynman path integral formula. 
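The statement that classical trajectories are stationary points of the action can be tested on the Riemann sum that appears in the exponent of (20.9). The following is a rough sketch, assuming one dimension, $m=1$, and the harmonic potential $V(x)=x^2/2$ (choices made here for illustration, not taken from the text). It compares the first variation of the discretized action at a solution of Newton's law with the first variation at a straight-line path having the same endpoints.

```python
import math

def discrete_action(path, t, m=1.0, V=lambda x: 0.5 * x * x):
    """Riemann-sum action for a 1-D path sampled at s_j = j*t/N, j = 0..N:
    sum over j of eps * [ (m/2) ((x_j - x_{j-1})/eps)^2 - V(x_j) ]."""
    N = len(path) - 1
    eps = t / N
    total = 0.0
    for j in range(1, N + 1):
        velocity = (path[j] - path[j - 1]) / eps
        total += eps * (0.5 * m * velocity ** 2 - V(path[j]))
    return total

def first_variation(path, bump, t, h=1e-4):
    """Central-difference derivative of the action in the direction `bump`."""
    plus = [x + h * b for x, b in zip(path, bump)]
    minus = [x - h * b for x, b in zip(path, bump)]
    return (discrete_action(plus, t) - discrete_action(minus, t)) / (2 * h)

t, N = 1.0, 1000
grid = [j * t / N for j in range(N + 1)]
classical = [math.sin(s) for s in grid]            # solves m x'' = -V'(x) = -x
straight = [s * math.sin(t) / t for s in grid]     # same endpoints, not a solution
bump = [math.sin(math.pi * s / t) for s in grid]   # vanishes at s = 0 and s = t

dS_classical = first_variation(classical, bump, t)  # close to zero
dS_straight = first_variation(straight, bump, t)    # clearly nonzero
```

Only one perturbation direction is tested here; stationarity proper means the first variation vanishes for every perturbation that fixes the endpoints.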

Now, knowledge of Lagrangian mechanics is not directly relevant to the
derivation of the Feynman path integral formula. Nevertheless, it is intriguing
that an important quantity from classical mechanics should appear
in the Feynman path integral formula in quantum mechanics. This
appearance raises the possibility that one can use the path integral formula
to make connections between quantum mechanics and classical mechanics.
Indeed, the "method of stationary phase" (when applied, formally, in an
infinite-dimensional setting) asserts that for small values of $\hbar$, the main
contribution to the right-hand side of (20.12) comes from regions near the
stationary points of the action functional, namely the classical trajectories.
Using this method, Gutzwiller was able to derive his famous trace formula,
which provides predictions of typical eigenvalue spacings for Schrödinger
operators based on the behavior of the underlying classical system. More
information about this fascinating subject can be found in books on "quantum
chaos," including [19] by Gutzwiller himself.

It is notoriously difficult to attach a rigorous meaning to the right-hand
side of the Feynman path integral formula. Note that the formal expression
"$\mathcal{D}\mathbf{x}$" is the limit as $N$ tends to infinity of the integral over $(\mathbb{R}^n)^N$ in
(20.9) with respect to the Lebesgue measure (i.e., the measure given by
$d\mathbf{x}_1\,d\mathbf{x}_2\cdots d\mathbf{x}_N$). Thus, "$\mathcal{D}\mathbf{x}$" should be something like Lebesgue measure
on the space of all paths (maps from $[0,t]$ into $\mathbb{R}^n$). However, it is known
that an infinite-dimensional vector space (say, a Banach space) does not
have any "reasonable" (say, $\sigma$-finite) translation-invariant measure that
could play the role of Lebesgue measure. Furthermore, the absolute value
of the constant $C$ is easily seen to be infinite. Thus, we certainly cannot
take the right-hand side of (20.12) literally.

A better approach is to avoid looking at the component parts of the
Feynman path integral and instead to look at the whole expression against
which the function $\psi(\mathbf{x}(t))$ is being integrated. If we could attach a rigorous
meaning to the expression

\[
C\exp\left\{\frac{i}{\hbar}S(\mathbf{x}(\cdot),0,t)\right\}\mathcal{D}\mathbf{x}, \tag{20.13}
\]

as, say, a complex-valued measure on the space of continuous paths, then
this could serve to give a meaning to the path integral. It is known, however,
that there is no complex measure on the space of paths that makes the
Feynman path integral formula true. The oscillatory behavior produced by
the $i$ in the exponent in (20.13) makes it difficult to give a rigorous meaning
to the Feynman path integral in its original form.


20.3 The Imaginary-Time Calculation 


In trying to give a rigorous meaning to the path integral formula of Feynman,
Kac proceeded by considering the "imaginary time" time-evolution
operator $\exp(-t\hat{H}/\hbar)$, which is just the usual time-evolution operator
$\exp(-it\hat{H}/\hbar)$ evaluated with $t$ replaced by $-it$. The idea is that if one
can use path integrals to understand the operators $\exp(-t\hat{H}/\hbar)$, one can
go back to the "real time" operator $\exp(-it\hat{H}/\hbar)$ by analytic continuation
with respect to $t$.

The counterpart of Theorem 4.5 for $\exp(t\hbar\Delta/(2m))$ (proved in the
same way) is

\[
\left(e^{t\hbar\Delta/(2m)}\psi\right)(\mathbf{x}_0)=\left(\frac{m}{2\pi t\hbar}\right)^{n/2}\int_{\mathbb{R}^n}\exp\left\{-\frac{m}{2t\hbar}|\mathbf{x}_1-\mathbf{x}_0|^{2}\right\}\psi(\mathbf{x}_1)\,d\mathbf{x}_1.
\]

Unlike Theorem 4.5, however, the above expression holds for all $\psi\in L^2(\mathbb{R}^n)$,
with absolute convergence of the integral for every $\mathbf{x}_0\in\mathbb{R}^n$. Applying the
Trotter product formula and rearranging the integral as before gives


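\[
\begin{aligned}
\left(e^{-t\hat{H}/\hbar}\psi\right)(\mathbf{x}_0)
&=\lim_{N\to\infty}C\int_{(\mathbb{R}^n)^N}\exp\left\{-\frac{1}{\hbar}\sum_{j=1}^{N}\varepsilon\left[\frac{m}{2}\left|\frac{\mathbf{x}_j-\mathbf{x}_{j-1}}{\varepsilon}\right|^{2}+V(\mathbf{x}_j)\right]\right\}\\
&\qquad\times\psi(\mathbf{x}_N)\,d\mathbf{x}_1\,d\mathbf{x}_2\cdots d\mathbf{x}_N.
\end{aligned} \tag{20.14}
\]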


If $V$ is, say, bounded below, then there is no difficulty in changing the
order of integration, because of the rapid decay of the integrand. Note that
there is a relative sign change between the two terms in square brackets,
compared to (20.9). Taking a formal limit as before gives

\[
\left(e^{-t\hat{H}/\hbar}\psi\right)(\mathbf{x}_0)=C\int_{\mathbf{x}(0)=\mathbf{x}_0}\exp\left\{-\frac{1}{\hbar}\int_{0}^{t}\left[\frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^{2}+V(\mathbf{x}(s))\right]ds\right\}\psi(\mathbf{x}(t))\,\mathcal{D}\mathbf{x}. \tag{20.15}
\]


Note that the integral in the exponent on the right-hand side is not the
classical action in (20.11), because the potential term has the wrong sign.
Kac's idea was to separate out the quadratic part of the exponent on the
right-hand side of (20.15) and attempt to interpret the expression

\[
C\exp\left\{-\frac{1}{\hbar}\int_{0}^{t}\frac{m}{2}\left|\frac{d\mathbf{x}}{ds}\right|^{2}ds\right\}\mathcal{D}\mathbf{x} \tag{20.16}
\]

as a measure on the space of paths. Specifically, this is a Gaussian measure,
one with a (formal) density with respect to the Lebesgue measure that is
the exponential of a quadratic expression. There is a well-developed theory
of Gaussian measures on infinite-dimensional spaces. Although there
is no Lebesgue measure in the infinite-dimensional case, one can construct
Gaussian measures as limits of Gaussian measures on spaces of large finite
dimension.


20.4 The Wiener Measure 


Kac identified the formal expression in (20.16) as the Wiener measure. To
be precise, for each fixed $\mathbf{x}_0\in\mathbb{R}^n$, there is a Wiener measure $\mu_{\mathbf{x}_0}$, where $\mu_{\mathbf{x}_0}$
is supported on the set of paths $\mathbf{x}:[0,t]\to\mathbb{R}^n$ with $\mathbf{x}(0)=\mathbf{x}_0$. The Wiener
measure was developed by Norbert Wiener as a rigorous embodiment of
Albert Einstein's mathematical model of Brownian motion. Einstein, in one
of his 1905 papers, had proposed that the random motion of a very small
particle in water was due to collisions between the particle and the water
molecules. Einstein postulated that the increments of a Brownian path
$\mathbf{x}$ [quantities of the form $\mathbf{x}(t)-\mathbf{x}(s)$] should be independent for disjoint
time intervals and should be normal random variables with mean zero and
variance proportional to $t-s$. The following theorem shows that there
is a unique measure on the space of continuous paths satisfying Einstein's
criteria. Let $C_{\mathbf{x}_0}([0,t];\mathbb{R}^n)$ denote the space of continuous maps $\mathbf{x}(\cdot)$ of $[0,t]$
into $\mathbb{R}^n$ satisfying $\mathbf{x}(0)=\mathbf{x}_0$, equipped with the supremum norm.


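Theorem 20.2 (Wiener) For each vector $\mathbf{x}_0\in\mathbb{R}^n$ and each pair of positive
numbers $\sigma$ and $t$, there exists a unique measure $\mu^{\sigma}_{\mathbf{x}_0}$ on the Borel
$\sigma$-algebra in $C_{\mathbf{x}_0}([0,t];\mathbb{R}^n)$ such that the following condition holds. For each
sequence $0=t_0<t_1<\cdots<t_N\leq t$ of real numbers and each non-negative
measurable function $f$ on $(\mathbb{R}^n)^N$, we have

\[
\begin{aligned}
&\int_{C_{\mathbf{x}_0}([0,t];\mathbb{R}^n)}f(\mathbf{x}(t_1),\mathbf{x}(t_2),\ldots,\mathbf{x}(t_N))\,d\mu^{\sigma}_{\mathbf{x}_0}(\mathbf{x})\\
&\quad=C\int_{(\mathbb{R}^n)^N}\exp\left\{-\frac{1}{2\sigma}\sum_{j=1}^{N}\frac{|\mathbf{x}_j-\mathbf{x}_{j-1}|^{2}}{t_j-t_{j-1}}\right\}f(\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_N)\,d\mathbf{x}_1\cdots d\mathbf{x}_N,
\end{aligned} \tag{20.17}
\]

where

\[
C=\prod_{j=1}^{N}\left(2\pi\sigma(t_j-t_{j-1})\right)^{-n/2}.
\]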





Note that the right-hand side of (20.17) is extremely similar to the right-hand
side of (20.14), except that there are no terms involving the potential
$V$ in the exponent in (20.17). Thus, it is reasonable to think that the Wiener
measure is a rigorous version of the formal expression in (20.16). It should
be noted, however, that the heuristic expression (20.16) is misleading in one
important respect. That expression suggests that the measure is supported
on paths $\mathbf{x}(\cdot)$ for which $d\mathbf{x}/ds$ belongs to $L^2([0,t];\mathbb{R}^n)$, since the exponential
factor would seemingly "damp out" any paths for which this is not the case.
This conclusion is, however, incorrect. [One should, in general, be extremely
cautious in drawing conclusions based on purely formal expressions such as
the one in (20.16).] Actually, the "typical" path with respect to the Wiener
measure is nowhere differentiable; that is, the set of paths $\mathbf{x}(\cdot)$ that are
differentiable at even one point $s$ forms a set of measure zero.
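A necessary consistency property behind Theorem 20.2 is that integrating out an intermediate time in (20.17) must reproduce the same formula with that time removed. The following sketch checks this numerically in the simplest case, assuming $n=1$ and the Gaussian transition density with variance $\sigma\,\Delta t$ that matches the exponent of (20.17); the function names and the quadrature parameters are illustrative choices.

```python
import math

def transition_density(dt, x, y, sigma=1.0):
    """Gaussian density in y with mean x and variance sigma*dt (case n = 1)."""
    return (math.exp(-(y - x) ** 2 / (2 * sigma * dt))
            / math.sqrt(2 * math.pi * sigma * dt))

def marginalize_middle(t1, t2, x0, x2, sigma=1.0, half_width=12.0, steps=4000):
    """Integrate the two-time density over the intermediate point x1
    with the trapezoid rule on [x0 - half_width, x0 + half_width]."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x1 = x0 - half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += (weight
                  * transition_density(t1, x0, x1, sigma)
                  * transition_density(t2 - t1, x1, x2, sigma))
    return total * h

# Two-time formula with x(t1) integrated out versus the one-time formula.
two_step = marginalize_middle(0.3, 1.0, 0.0, 0.7)
one_step = transition_density(1.0, 0.0, 0.7)
```

The agreement of `two_step` and `one_step` reflects the fact that variances of independent Gaussian increments add, which is exactly Einstein's requirement that the variance be proportional to the elapsed time.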

This discrepancy is actually a general feature of Gaussian measures on
infinite-dimensional spaces: They are always supported on a larger space
than the formal expression would suggest. In the case of the Wiener measure,
the space on which the measure actually lives (the space of continuous
functions) is nice enough that no difficulties arise in the formulation of our
main result, the Feynman–Kac formula. In the setting of quantum field theory,
however, issues concerning the support of a Gaussian measure become
serious difficulties. See Sect. 20.6 for more information.


20.5 The Feynman–Kac Formula


The Wiener measure gives a rigorous interpretation to the expression in
(20.16). Thus, the Wiener measure encapsulates everything in (20.15) except
for the term involving $V$ in the exponent and the factor of $\psi(\mathbf{x}(t))$.
This reasoning accounts for the form of the following result.

Theorem 20.3 (Feynman–Kac Formula) Suppose $V:\mathbb{R}^3\to\mathbb{R}$ can be
expressed as the sum of a function in $L^2(\mathbb{R}^3)$ and a bounded function. Then
for all $\mathbf{x}_0\in\mathbb{R}^3$, we have

\[
\left(e^{-t\hat{H}/\hbar}\psi\right)(\mathbf{x}_0)=\int_{C_{\mathbf{x}_0}([0,t];\mathbb{R}^3)}\exp\left\{-\frac{1}{\hbar}\int_{0}^{t}V(\mathbf{x}(s))\,ds\right\}\psi(\mathbf{x}(t))\,d\mu^{\sigma}_{\mathbf{x}_0}(\mathbf{x}),
\]

where $\mu^{\sigma}_{\mathbf{x}_0}$ is the Wiener measure on $C_{\mathbf{x}_0}([0,t];\mathbb{R}^3)$ and where $\sigma=\hbar/m$.

Of course, similar results hold in other dimensions, under suitable assumptions
on the potential. We refer the interested reader to [37] or [14]
for details on different versions of the Feynman–Kac formula. Theorem 20.3
cannot be obtained directly from the Trotter product formula, because the
limit in (20.14) is an $L^2$ limit rather than a pointwise limit. We will content
ourselves with proving an "integrated" version of the Feynman–Kac
formula for nice potentials; Theorem 20.3 is Theorem 6.5 of [37].
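The Feynman–Kac formula lends itself to a Monte Carlo experiment: sample Brownian paths, weight each by the exponential of the integrated potential, and average. The sketch below is only an illustration under assumptions not made in the text: it works in one dimension with $\hbar=m=1$ and the harmonic potential $V(x)=x^2/2$, which is unbounded and so lies outside the hypotheses stated above, although the formula is known to extend to such potentials. For this choice, $\psi_0(x)=e^{-x^2/2}$ satisfies $\hat{H}\psi_0=\tfrac12\psi_0$, so the exact value $e^{-t/2}\psi_0(x_0)$ is available for comparison. The time integral is replaced by the Riemann sum from (20.14).

```python
import math
import random

def feynman_kac_estimate(x0, t, psi, V, hbar=1.0, m=1.0,
                         steps=100, paths=4000, seed=0):
    """Average of exp(-(1/hbar) * sum_j V(x_j) dt) * psi(x_N) over random walks
    whose increments are independent normals with variance sigma*dt."""
    rng = random.Random(seed)
    sigma = hbar / m
    dt = t / steps
    step_std = math.sqrt(sigma * dt)
    total = 0.0
    for _ in range(paths):
        x = x0
        potential_integral = 0.0
        for _ in range(steps):
            x += rng.gauss(0.0, step_std)       # Brownian increment
            potential_integral += V(x) * dt     # right-endpoint Riemann sum
        total += math.exp(-potential_integral / hbar) * psi(x)
    return total / paths

psi0 = lambda x: math.exp(-0.5 * x * x)
estimate = feynman_kac_estimate(0.0, 1.0, psi0, lambda x: 0.5 * x * x)
exact = math.exp(-0.5) * psi0(0.0)   # e^{-t/2} psi0(x0) with t = 1, x0 = 0
```

The estimate carries both a statistical error, of order one over the square root of the number of paths, and a discretization error from the finite number of time steps, so only rough agreement with `exact` should be expected.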


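Definition 20.4 Let $C([0,t];\mathbb{R}^n)$ denote the space of all continuous paths
on $[0,t]$ with values in $\mathbb{R}^n$. For all $\sigma>0$, let $\mu^{\sigma}$ be the measure on
$C([0,t];\mathbb{R}^n)$ given by

\[
\mu^{\sigma}(E)=\int_{\mathbb{R}^n}\mu^{\sigma}_{\mathbf{x}_0}(E)\,d\mathbf{x}_0.
\]

Proposition 20.5 Suppose $V:\mathbb{R}^n\to\mathbb{R}$ is bounded and continuous. Then
for all $\phi,\psi\in L^2(\mathbb{R}^n)$, we have

\[
\left\langle\phi,e^{-t\hat{H}/\hbar}\psi\right\rangle=\int_{C([0,t];\mathbb{R}^n)}\overline{\phi(\mathbf{x}(0))}\exp\left\{-\frac{1}{\hbar}\int_{0}^{t}V(\mathbf{x}(s))\,ds\right\}\psi(\mathbf{x}(t))\,d\mu^{\sigma}(\mathbf{x}),
\]

where $\mu^{\sigma}$ is as in Definition 20.4 and where $\sigma=\hbar/m$.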


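Proof. We begin with (20.14) and apply Theorem 20.2 with parameters
chosen as follows. We take $\sigma=\hbar/m$, we take the sequence $(t_j)$ to be given
by $t_j=jt/N$, and we take $f$ to be the function given by

\[
f(\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_N)=\exp\left\{-\frac{1}{\hbar}\sum_{j=1}^{N}\frac{t}{N}V(\mathbf{x}_j)\right\}\psi(\mathbf{x}_N).
\]

Theorem 20.2 then allows us to express the right-hand side of (20.14) as
an integral against the Wiener measure, giving

\[
\left(e^{-t\hat{H}/\hbar}\psi\right)(\mathbf{x}_0)=\lim_{N\to\infty}\int_{C_{\mathbf{x}_0}([0,t];\mathbb{R}^n)}\exp\left\{-\frac{1}{\hbar}\sum_{j=1}^{N}\frac{t}{N}V\left(\mathbf{x}\left(\frac{jt}{N}\right)\right)\right\}\psi(\mathbf{x}(t))\,d\mu^{\sigma}_{\mathbf{x}_0}(\mathbf{x}).
\]

Since the limit in the above equation is an $L^2$ limit, we may move the
inner product with $\phi$ inside the limit on the right-hand side. The integral
with respect to $\mu^{\sigma}_{\mathbf{x}_0}$ and the integral with respect to $d\mathbf{x}_0$ may then be
combined into a single integral with respect to $\mu^{\sigma}$, giving

\[
\begin{aligned}
\left\langle\phi,e^{-t\hat{H}/\hbar}\psi\right\rangle
&=\lim_{N\to\infty}\int_{C([0,t];\mathbb{R}^n)}\overline{\phi(\mathbf{x}(0))}\\
&\qquad\times\exp\left\{-\frac{1}{\hbar}\sum_{j=1}^{N}\frac{t}{N}V\left(\mathbf{x}\left(\frac{jt}{N}\right)\right)\right\}\psi(\mathbf{x}(t))\,d\mu^{\sigma}(\mathbf{x}).
\end{aligned} \tag{20.18}
\]

Now, since $V$ is continuous,

\[
\lim_{N\to\infty}\sum_{j=1}^{N}\frac{t}{N}V\left(\mathbf{x}\left(\frac{jt}{N}\right)\right)=\int_{0}^{t}V(\mathbf{x}(s))\,ds,
\]

for every continuous path $\mathbf{x}$. Furthermore, it is easily seen that the "distribution"
of the quantity $\mathbf{x}(s)$ with respect to the measure $\mu^{\sigma}$ is the Lebesgue
measure on $\mathbb{R}^n$, for any $s\in[0,t]$. Thus, the function $\mathbf{x}\mapsto\phi(\mathbf{x}(0))$ is
square-integrable with respect to $\mu^{\sigma}$, with $L^2$ norm equal to the $L^2$ norm
of $\phi$ over $\mathbb{R}^n$, and similarly for $\mathbf{x}\mapsto\psi(\mathbf{x}(t))$. It follows that the quantity
$\overline{\phi(\mathbf{x}(0))}\psi(\mathbf{x}(t))$ is an $L^1$ function on $C([0,t];\mathbb{R}^n)$. Since $V$ is bounded, we
may apply dominated convergence to move the limit inside the integral, at
which point we obtain the desired result. ∎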





20.6 Path Integrals in Quantum Field Theory 


In this section, we briefly discuss the path integral approach to quantum
field theory. We consider quantum field theory in a space-time of dimension
$d$, so that space has dimension $d-1$. The configuration space for the classical
version of the theory is the collection of "spatial" fields, that is, maps $\phi(\mathbf{x})$
of $\mathbb{R}^{d-1}$ into some finite-dimensional vector space $V$. A path in the space
of fields is then a map $\phi(\mathbf{x},t)$ of $\mathbb{R}^{d-1}\times\mathbb{R}\cong\mathbb{R}^d$ into $V$. In the path
integral approach to quantum field theory (which is the most commonly
used approach to the subject), one considers integrals over the space of
such paths.

Let us consider, as a simple example, what is called $\phi^4$ theory. In this
theory, the fields $\phi$ map into $\mathbb{R}$ and we consider a path integral of the form

\[
C\int_{\mathcal{F}_d}\exp\left\{-\frac{1}{\hbar}\int_{\mathbb{R}^d}\left[c_1|\nabla\phi(\mathbf{x})|^{2}+c_2\phi(\mathbf{x})^{2}+c_3\phi(\mathbf{x})^{4}\right]d\mathbf{x}\right\}F(\phi)\,\mathcal{D}\phi, \tag{20.19}
\]


for some functional $F(\phi)$ on the space of fields. [The expression in (20.19)
is, more precisely, a "Euclidean" or "imaginary time" path integral. Such
an integral is the counterpart in quantum field theory of the integral occurring
in the Feynman–Kac formula in quantum mechanics.] In (20.19), $\mathcal{F}_d$
represents the space of all "fields" (i.e., functions) mapping our space-time
$\mathbb{R}^d$ into $\mathbb{R}$. In an attempt to make sense of this heuristic expression, we
may follow the strategy we used in deriving the Feynman–Kac formula by
separating out the quadratic part of the exponent. We look, then, for a
measure $\mu$ on $\mathcal{F}_d$ given by the heuristic expression

\[
d\mu(\phi)\text{ ``='' }C\exp\left\{-\frac{1}{\hbar}\int_{\mathbb{R}^d}\left[c_1|\nabla\phi(\mathbf{x})|^{2}+c_2\phi(\mathbf{x})^{2}\right]d\mathbf{x}\right\}\mathcal{D}\phi. \tag{20.20}
\]

Using the theory of Gaussian measures, one can construct a rigorously
defined measure corresponding to the heuristic expression in (20.20). There
is, however, a serious difficulty with this approach: The measure $\mu$ is supported
on very "rough" fields, much rougher than the heuristic expression
suggests. In fact, we have the following result.


Proposition 20.6 For all $d\geq 1$, there exists a Gaussian measure on the
space $\mathcal{F}_d$ of fields on $\mathbb{R}^d$ corresponding to the heuristic expression (20.20).
For $d\geq 2$, however, this measure is not supported on any space of ordinary
functions, but rather on a space of distributions.

We will not prove this result here; see Sect. 8.5 of [14] for more information.
Here, then, is the problem with the path integral approach to quantum
field theory on space-times of dimension $d\geq 2$: The functional $\int_{\mathbb{R}^d}\phi(\mathbf{x})^4\,d\mathbf{x}$
does not make sense for a "typical" field with respect to the measure $\mu$ in
(20.20). As a result, we cannot make sense of (20.19) simply by absorbing
all the Gaussian part into the definition of the measure $\mu$, since what is
left over is not a $\mu$-almost everywhere defined functional of $\phi$. Indeed, even
a local integral, of the form $\int_U\phi(\mathbf{x})^4\,d\mathbf{x}$ for some bounded region $U$ in
$\mathbb{R}^d$, fails to be almost-everywhere defined with respect to $\mu$. After all, if
$\int_U\phi(\mathbf{x})^4\,d\mathbf{x}$ made sense, then $\phi$ would be a locally $L^4$ function, rather than
a distribution.

It should be emphasized that the difficulty described in the previous
paragraph is not just a technicality that can be swept away by some simple
trick. Furthermore, this difficulty is not specific to $\phi^4$ theory, but is present
in all "nontrivial" field theories. In all interesting field theories, the fields
defined by the Gaussian part of the path integral are fundamentally "too
rough" to allow us to make sense of the non-Gaussian part of the integral.
This phenomenon is the fundamental mathematical difficulty in the path
integral approach to quantum field theory.

To have a chance to make rigorous sense of path integrals in quantum
field theory, one has to employ a complicated regularization process known
as renormalization. This process has, so far, been carried out in a rigorous
fashion only for a very small number of field theories. One of the Clay
Millennium Prize problems is to make rigorous sense out of the Yang–Mills
field theory in four space-time dimensions. See [14] for a detailed survey
of the mathematical issues connected with the path integral approach to
quantum field theory. See also [13] for a treatment of quantum field theory
and renormalization with a greater eye toward the physical content.

Since the roughness of the fields is a major problem in trying to give a
rigorous meaning to path integrals, let us think for a moment about why it arises.
Suppose we wish to construct a Gaussian measure from a certain heuristic
expression of the form $\mu=Ce^{-Q(x)}\,\mathcal{D}x$, where $Q$ is a positive-definite
quadratic functional of $x$. A reasonable approach is to consider the (real)
Hilbert space $H$ for which $\|x\|_H^2=Q(x)$. [In the case of (20.20), $H$ would
be the "Sobolev space" of fields having one derivative in $L^2$.] The heuristic
expression for the Gaussian measure then takes the form

\[
d\mu(x)=Ce^{-\|x\|_H^{2}}\,\mathcal{D}x. \tag{20.21}
\]


One might now try to approximate $\mu$ by Gaussian measures $\mu_N$ on
Hilbert spaces $H_N$ of dimension $N<\infty$. If $\dim H<\infty$, then the expression
(20.21) is perfectly rigorous, where the constant $C$ may be taken to
normalize $\mu$ to be a probability measure. A simple calculation (Exercise 4),
however, shows that for any $R$, we have

\[
\lim_{N\to\infty}\mu_N(B_{R,N})=0,
\]

where $B_{R,N}$ denotes the ball of radius $R$ in $H_N$. This means that in the
$N\to\infty$ limit, all of the "mass" of the measure is outside the ball of radius
$R$, for every $R$. Thus, in the limit, the measure is supported entirely on
points $x$ where $\|x\|_H=\infty$, that is, on points that are not actually in $H$.
The measures $\mu_N$ do converge to a measure $\mu$ as $N$ tends to infinity, but
$\mu$ does not live on $H$, but on some larger space $B\supset H$. The original space
$H$ is a set of $\mu$-measure zero inside $B$. See [16] for more information. In the
case of the measure $\mu$ corresponding to the heuristic expression in (20.20),
$\mu$ does not, as the expression suggests, live on the Sobolev space of fields
with one derivative in $L^2$, but on a larger space, which turns out to be a
space of distributions.
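The escape of mass from every ball can be made concrete in the finite-dimensional approximations. The sketch below assumes the normalization $d\mu_N(x)=\pi^{-N/2}e^{-\|x\|^2}\,dx$ on $\mathbb{R}^N$ (the form suggested by (20.21) and Exercise 4; this normalization is an assumption of the sketch). Passing to polar coordinates, $\mu_N(B_{R,N})$ equals the regularized lower incomplete gamma function $P(N/2,R^2)$, which is evaluated here from its power series. The result is compared with the cube bound $\mathrm{erf}(R)^N$ coming from the hint in Exercise 4.

```python
import math

def ball_mass(N, R, terms=400):
    """mu_N(ball of radius R) for the density pi^{-N/2} exp(-|x|^2) on R^N,
    i.e. P(a, x) = exp(-x) * sum_k x^(a+k) / Gamma(a+k+1) with a = N/2, x = R^2.
    Each term is computed through logarithms to avoid overflow."""
    a, x = N / 2.0, R * R
    return sum(math.exp(-x + (a + k) * math.log(x) - math.lgamma(a + k + 1))
               for k in range(terms))

R = 2.0
dimensions = (1, 2, 10, 50, 100)
masses = {N: ball_mass(N, R) for N in dimensions}
# The ball sits inside the cube [-R, R]^N, whose mass factorizes as erf(R)^N.
cube_bounds = {N: math.erf(R) ** N for N in dimensions}
```

For $R=2$ the mass of the ball is about $0.98$ in dimension 2 and about $0.37$ in dimension 10, and it is negligible by dimension 50, even though the cube bound decays much more slowly.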


20.7 Exercises 


1. Verify the identity (20.3) in the proof of the Trotter product formula.

2. Verify (20.5) in the proof of the Trotter product formula, using Stone's
theorem and the following identity:

\[
\frac{1}{s}\left(e^{isA}e^{isB}-I\right)\psi=e^{isA}(iB\psi)+e^{isA}\left(\frac{1}{s}\left(e^{isB}-I\right)\psi-iB\psi\right)+\frac{1}{s}\left(e^{isA}-I\right)\psi.
\]

3. Suppose $\{A_N\}$ is a family of bounded operators mapping a Banach
space $W_1$ to a Banach space $W_2$. Suppose that for some constant $C$,
we have $\|A_N\|\leq C$ for all $N$. Finally, suppose that $\|A_N\psi\|\to 0$ as
$N\to\infty$, for every $\psi\in W_1$.

(a) Show that for each $\psi\in W_1$ and each $\varepsilon>0$, there exists a
neighborhood $U$ of $\psi$ and an integer $M$ such that

\[
\|A_N\phi\|<\varepsilon
\]

for all $\phi\in U$ and $N\geq M$.

(b) If $K$ is a compact subset of $W_1$, show that $\|A_N\psi\|$ tends to zero
uniformly for $\psi\in K$.

4. (a) Let $H_N$ be an $N$-dimensional Hilbert space. Show that the measure

\[
d\mu_N(x)=\pi^{-N/2}e^{-\|x\|^{2}}\,dx
\]

is a probability measure. Here $dx$ is the Lebesgue measure on
$H_N$, normalized so that the unit cube has volume 1.

Hint: Use Proposition A.22.

(b) Let $B_{R,N}$ denote the ball of radius $R$ in $H_N$. Show that for each
$R<\infty$, there exists a number $a_R<1$ such that

\[
\mu_N(B_{R,N})\leq(a_R)^{N}.
\]

Thus, $\lim_{N\to\infty}\mu_N(B_{R,N})=0$.

Hint: The ball $B_{R,N}$ is contained in a cube centered at the origin
with side length $2R$.


21 Hamiltonian Mechanics on Manifolds


In this chapter, we generalize the Hamiltonian approach to mechanics (in- 
troduced already in the Euclidean case in Sect. 2.5) to general manifolds. 
The chapter assumes familiarity with the basic notions of smooth mani- 
folds, including tangent and cotangent spaces, vector fields, and differen- 
tial forms. These notions are reviewed very briefly in Sect. 21.1, mainly in 
the interest of fixing the notation. See, for example, Chap. 2 of [40] for a 
concise treatment of manifolds and [29] for a detailed account. Throughout 
the chapter, we will use the summation convention, that repeated indices 
are always summed on. 


21.1 Calculus on Manifolds 


Throughout this section, M will denote a smooth, n-dimensional manifold. 


21.1.1 Tangent Spaces, Vector Fields, and Flows 
For each $x\in M$, we have the tangent space to $M$ at $x$, denoted $T_xM$. Given
a smooth coordinate system $x_1,\ldots,x_n$ on $M$, the vectors

\[
\frac{\partial}{\partial x_1},\ldots,\frac{\partial}{\partial x_n} \tag{21.1}
\]

form a basis for the tangent space at each point. A vector field $X$ on $M$
is a map assigning to each point $x\in M$ an element $X_x$ of $T_xM$. A vector
field $X$ is smooth if the coefficients of $X$ in a basis of the form (21.1) are
smooth functions, for every smooth coordinate system. As in Exercise 14
in Chap. 2, we think of a vector field as a first-order differential operator
satisfying the Leibniz rule:

\[
X(fg)=X(f)g+fX(g).
\]


Given a smooth vector field $X$ on $M$ and a point $x\in M$, there exists a
curve $\gamma_x:(a,b)\to M$ such that $\gamma_x(0)=x$ and

\[
\frac{d\gamma_x}{dt}=X_{\gamma_x(t)}.
\]

Any two such curves agree on the intersection of their intervals of definition.
There is a largest interval $(a_x^{\max},b_x^{\max})$ on which such a curve can be defined.
If, for each $x\in M$, we have $a_x^{\max}=-\infty$ and $b_x^{\max}=+\infty$, we say that the
vector field $X$ is complete. If $M$ is compact, then each smooth vector field
on $M$ is complete. We may assemble the curves $\gamma_x$ into the flow $\Phi$ generated
by $X$, defined as

\[
\Phi_t(x)=\gamma_x(t),
\]

whenever $a_x^{\max}<t<b_x^{\max}$. If $t$ does not belong to $(a_x^{\max},b_x^{\max})$, then $\Phi_t(x)$
is not defined. The flow $\Phi$ satisfies

\[
\Phi_0(x)=x. \tag{21.2}
\]

Furthermore, if $x$ is in the domain of $\Phi_t$ and $\Phi_t(x)$ is in the domain of $\Phi_s$,
then $x$ is in the domain of $\Phi_{s+t}$ and

\[
\Phi_s(\Phi_t(x))=\Phi_{s+t}(x). \tag{21.3}
\]

In the other direction, given a family of maps $\Phi$ satisfying (21.2) and
(21.3) and appropriate domain properties, there is a unique vector field $X$
such that $\Phi$ is the flow generated by $X$. In particular, if $\Phi_t(x)$ is defined
for all $x$ and $t$, is smooth as a map of $M\times\mathbb{R}$ into $M$, and satisfies (21.2)
and (21.3), there is a unique complete vector field $X$ such that $\Phi$ is the
flow generated by $X$.
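The properties (21.2) and (21.3) are easy to check on an explicit example. The following sketch assumes $M=\mathbb{R}^2$ with the rotation vector field $X=-y\,\partial/\partial x+x\,\partial/\partial y$ (an example chosen here, not one from the text), whose flow is rotation by the angle $t$ and is therefore complete.

```python
import math

def X(point):
    """Components of the vector field X = -y d/dx + x d/dy at a point."""
    x, y = point
    return (-y, x)

def flow(t, point):
    """Phi_t for the rotation field: rotate the point by the angle t."""
    x, y = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

p = (1.0, 0.5)
s, t = 0.7, -1.9

identity_defect = dist(flow(0.0, p), p)                        # (21.2)
group_defect = dist(flow(s, flow(t, p)), flow(s + t, p))       # (21.3)

# The curve t -> Phi_t(p) should have velocity X at the point Phi_t(p).
h = 1e-6
ahead, behind = flow(t + h, p), flow(t - h, p)
velocity = ((ahead[0] - behind[0]) / (2 * h), (ahead[1] - behind[1]) / (2 * h))
ode_defect = dist(velocity, X(flow(t, p)))
```

All three defects vanish up to rounding and finite-difference error.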


21.1.2 Differential Forms 


For each $x$, the tangent space $T_xM$ is an $n$-dimensional real vector space.
The dual vector space to $T_xM$ is the cotangent space to $M$ at $x$, denoted
$T_x^*M$. Given a smooth function $f$ on $M$ and a point $x\in M$, the differential
of $f$ at $x$ is the element $df$ of $T_x^*M$ given by

\[
df(X)=X(f)
\]

for each $X\in T_xM$. In particular, in any local coordinate system $x_1,\ldots,x_n$,
the elements $dx_1,\ldots,dx_n$ satisfy

\[
dx_j\left(\frac{\partial}{\partial x_k}\right)=\delta_{jk}.
\]

Thus, the elements $dx_1,\ldots,dx_n$ form a basis for $T_x^*M$ at each point. For
any smooth function $f$, we have

\[
df=\frac{\partial f}{\partial x_j}\,dx_j. \tag{21.4}
\]

A $k$-form $\alpha$ on $M$ is a mapping assigning to each point $x\in M$ a $k$-linear,
alternating functional $\alpha_x$ on $T_xM$. A $k$-form is smooth if $\alpha(X_1,\ldots,X_k)$ is a
smooth function on $M$ for each $k$-tuple of smooth vector fields $X_1,\ldots,X_k$
on $M$. In particular, if $f$ is a smooth function, then $df$ is a smooth 1-form.
If $\alpha$ is a smooth $k$-form and $X$ a smooth vector field, we may define the
contraction of $\alpha$ with $X$, which is the $(k-1)$-form $i_X\alpha$ given by

\[
(i_X\alpha)(X_1,\ldots,X_{k-1})=\alpha(X,X_1,\ldots,X_{k-1}).
\]

Given a $k$-linear form $\phi$ on a vector space $V$, define the antisymmetrization
$AS(\phi)$ of $\phi$ by

\[
AS(\phi)(v_1,\ldots,v_k)=\sum_{\sigma\in S_k}\mathrm{sign}(\sigma)\,\phi(v_{\sigma(1)},v_{\sigma(2)},\ldots,v_{\sigma(k)}),
\]

where $S_k$ denotes the permutation group on $k$ elements. Given a $k$-form $\alpha$
and an $l$-form $\beta$ on $M$, let $\alpha\otimes\beta$ be the $(k+l)$-linear form on each $T_xM$
given by

\[
(\alpha\otimes\beta)(X_1,\ldots,X_{k+l})=\alpha(X_1,\ldots,X_k)\,\beta(X_{k+1},\ldots,X_{k+l}).
\]

Then let $\alpha\wedge\beta$ denote the $(k+l)$-form given by

\[
\alpha\wedge\beta=AS(\alpha\otimes\beta).
\]

In particular, if $\alpha$ and $\beta$ are 1-forms, then $\alpha\wedge\beta$ is the 2-form given by

\[
(\alpha\wedge\beta)(X,Y)=\alpha(X)\beta(Y)-\alpha(Y)\beta(X).
\]
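The antisymmetrization recipe can be carried out literally for two 1-forms on $\mathbb{R}^3$. In the sketch below, forms are represented as Python functions of tangent vectors (tuples of components); the particular coefficients are arbitrary illustrative values.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def antisymmetrize(phi, k):
    """AS(phi): signed sum of phi over all reorderings of its k arguments."""
    def result(*vectors):
        return sum(sign(p) * phi(*(vectors[i] for i in p))
                   for p in permutations(range(k)))
    return result

def one_form(coeffs):
    """The 1-form with constant coefficients coeffs_j, i.e. coeffs_j dx_j."""
    return lambda v: sum(c * x for c, x in zip(coeffs, v))

def tensor(alpha, beta):
    """(alpha tensor beta)(X, Y) = alpha(X) * beta(Y) for 1-forms."""
    return lambda X, Y: alpha(X) * beta(Y)

alpha = one_form((1.0, 2.0, 0.0))
beta = one_form((0.0, -1.0, 3.0))
wedge = antisymmetrize(tensor(alpha, beta), 2)

X, Y = (1.0, 0.0, 2.0), (0.5, 1.0, -1.0)
direct = alpha(X) * beta(Y) - alpha(Y) * beta(X)   # formula for 1-forms
```

Here `wedge(X, Y)` agrees with `direct`, changes sign when `X` and `Y` are swapped, and vanishes when both arguments are equal.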


In a smooth coordinate system $x_1,\ldots,x_n$, a smooth $k$-form $\alpha$ can be
expressed uniquely as

\[
\alpha=a_{j_1,\ldots,j_k}(x)\,dx_{j_1}\wedge\cdots\wedge dx_{j_k}.
\]

A 2-form $\omega$ on $M$ is said to be nondegenerate if $\omega$ defines a nondegenerate
bilinear form on each $T_xM$. More explicitly, this means that for each $x\in M$
and each nonzero $X\in T_xM$, there exists a $Y\in T_xM$ such that

\[
\omega(X,Y)\neq 0.
\]

Suppose $\alpha$ is a smooth $k$-form on $M$ and $S$ is a compact, oriented,
$k$-dimensional submanifold-with-boundary of $M$. Then one can define the
integral of $\alpha$ over $S$. There is a map $d$, called the exterior derivative,
mapping smooth $k$-forms to smooth $(k+1)$-forms and having the property
that

\[
\int_S d\beta=\int_{\partial S}\beta \tag{21.5}
\]

for every compact, oriented, $k$-dimensional submanifold-with-boundary $S$
of $M$ and every $(k-1)$-form $\beta$ on $M$. Here $\partial S$ is the boundary of $S$, with the
natural orientation induced by the orientation on $S$. The relation (21.5) is
known as Stokes's theorem. A $k$-form $\alpha$ is said to be closed if $d\alpha=0$.

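The exterior derivative may be computed in coordinates by the formula

\[
d(f\,dx_{j_1}\wedge\cdots\wedge dx_{j_k})=\frac{\partial f}{\partial x_l}\,dx_l\wedge dx_{j_1}\wedge\cdots\wedge dx_{j_k}.
\]

A coordinate-invariant formula for the exterior derivative of a $k$-form $\alpha$ is:

\[
\begin{aligned}
(d\alpha)(X_1,\ldots,X_{k+1})
&=\sum_{j=1}^{k+1}(-1)^{j+1}X_j\left(\alpha(X_1,\ldots,\hat{X}_j,\ldots,X_{k+1})\right)\\
&\quad+\sum_{j<l}(-1)^{j+l}\alpha([X_j,X_l],X_1,\ldots,\hat{X}_j,\ldots,\hat{X}_l,\ldots,X_{k+1}),
\end{aligned}
\]

where $\hat{X}_j$ indicates that the $X_j$ term is omitted and where $[X_j,X_l]$ is the
commutator of $X_j$ and $X_l$ as first-order differential operators. In particular,
if $\alpha$ is a 1-form, we have

\[
(d\alpha)(X,Y)=X(\alpha(Y))-Y(\alpha(X))-\alpha([X,Y]). \tag{21.6}
\]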
A key identity satisfied by the exterior derivative is

\[
d(d\alpha)=0
\]

for all $k$-forms $\alpha$. Conversely, if $\beta$ is a closed $(k+1)$-form (i.e., $d\beta=0$), then
$\beta$ can be expressed locally in the form $\beta=d\alpha$ for some $k$-form $\alpha$. More
precisely, if $\beta$ is closed, then for any $x\in M$ there exists a neighborhood $U$ of
$x$ and a $k$-form $\alpha$ defined on $U$ such that $\beta=d\alpha$ on $U$. If $M$ satisfies certain
topological conditions, then each closed $k$-form $\alpha$ on $M$ can be expressed
globally in the form $\alpha=d\beta$. In particular, if $M$ is simply connected, then
each closed 1-form $\beta$ can be expressed globally in the form $\beta=df$ for some
smooth function (i.e., 0-form) $f$.

If $X$ is a vector field and $\alpha$ is a $k$-form, we may define the Lie derivative
of $\alpha$ in the direction of $X$, denoted $\mathcal{L}_X\alpha$, as follows:

\[
\mathcal{L}_X\alpha=\left.\frac{d}{dt}(\Phi_t)^*(\alpha)\right|_{t=0},
\]

where $\Phi_t$ is the flow generated by $X$ and $(\Phi_t)^*(\alpha)$ is the pullback of $\alpha$ by
$\Phi_t$. The Lie derivative may be computed using the formula

\[
\mathcal{L}_X=i_X\circ d+d\circ i_X. \tag{21.7}
\]


21.2 Mechanics on Symplectic Manifolds 


The reader is warned that sign conventions in the subject of Hamiltonian 
mechanics are not consistent from author to author. 


21.2.1 Symplectic Manifolds 


A symplectic manifold is, roughly, a manifold with enough additional struc- 
ture to allow one to define the Poisson bracket of two functions. 


Definition 21.1 A symplectic manifold is a smooth manifold $N$ together
with a closed, nondegenerate 2-form $\omega$ on $N$. If $(N_1,\omega_1)$ and $(N_2,\omega_2)$
are symplectic manifolds, a map $\Phi:N_1\to N_2$ is a symplectomorphism
if $\Phi$ is a diffeomorphism and in addition

\[
\Phi^*(\omega_2)=\omega_1.
\]


It is not hard to see that every symplectic manifold must be even dimen- 
sional, for the simple reason that an odd-dimensional vector space does not 
admit a nondegenerate, skew-symmetric bilinear form. 
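The reason is that the matrix $A$ of a skew-symmetric bilinear form on an $n$-dimensional space satisfies $\det A=\det A^{T}=\det(-A)=(-1)^n\det A$, so $\det A=0$ when $n$ is odd. The sketch below checks this on randomly generated skew-symmetric matrices of sizes 3 and 5 (illustrative choices) and contrasts it with the standard nondegenerate form in dimension 2.

```python
import random

def det(M):
    """Determinant by cofactor expansion along the first row
    (adequate for the small matrices used here)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def random_skew(n, rng):
    """Random n x n matrix A with A[j][i] = -A[i][j] and zero diagonal."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = rng.uniform(-1.0, 1.0)
            A[j][i] = -A[i][j]
    return A

rng = random.Random(1)
odd_dets = [det(random_skew(n, rng)) for n in (3, 5) for _ in range(5)]

# In even dimension, nondegenerate examples exist, e.g. the standard form on R^2.
standard = [[0.0, 1.0], [-1.0, 0.0]]
standard_det = det(standard)
```

Every entry of `odd_dets` is zero up to rounding, while `standard_det` equals 1.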

Throughout this chapter, N will always denote a symplectic manifold of 
dimension 2n with symplectic form w. 

We now show that the cotangent bundle of any manifold has the structure
of a symplectic manifold in a canonical way. Suppose $x_1,\ldots,x_n$ is
a coordinate system defined on an open set $U \subset M$. Then at each point
$x \in U$, an element $\phi$ of $T_x^*M$ can be expressed uniquely in the form

$$\phi = p_j\,dx_j$$

for a sequence $p_1,\ldots,p_n$ of real numbers. The quantities $x_1,\ldots,x_n$ and
$p_1,\ldots,p_n$ constitute a coordinate system on $\pi^{-1}(U)$. We refer to a
coordinate system of this sort as a standard coordinate system on $T^*M$.


Example 21.2 For any smooth manifold $M$, define a 1-form $\theta$ on the
cotangent bundle $T^*M$ by

$$\theta_{(x,\phi)}(X) = \phi(\pi_*(X))$$

for each tangent vector $X \in T_{(x,\phi)}(T^*M)$, where $\pi : T^*M \to M$ is the
canonical projection. Then the 2-form $\omega := d\theta$ is closed and nondegenerate.
We refer to $\theta$ and $\omega$ as the canonical 1-form and the canonical 2-form on
$T^*M$, respectively.

Proof. Using a coordinate system $\{x_j\}$ on $M$ and the associated standard
coordinate system $\{x_j,p_j\}$ on $T^*M$, the projection $\pi$ is given by
$\pi(x,p) = x$. Meanwhile, a tangent vector $X$ to $T^*M$ is expressible as a
linear combination of the $\partial/\partial x_j$'s and $\partial/\partial p_j$'s. Thus,

$$\theta\left(a_j\frac{\partial}{\partial x_j} + b_j\frac{\partial}{\partial p_j}\right) = \phi\left(a_j\frac{\partial}{\partial x_j}\right) = a_j p_j.$$

What this means is that

$$\theta = p_j\,dx_j,$$

where the $x_j$'s are now viewed as functions on $T^*M$ rather than on $M$. We
have, then,

$$\omega = d\theta = dp_j \wedge dx_j.$$

It is now easy to see that $\omega$ is nondegenerate (Exercise 1).


21.2.2 Poisson Brackets and Hamiltonian Vector Fields 


If $\omega$ is nondegenerate, then it gives a canonical identification of $T_xN$ with
$T_x^*N$ at each point, by identifying a vector $X$ in $T_xN$ with the linear
functional $\omega(X,\cdot)$ in $T_x^*N$. We can then transfer the bilinear form $\omega$ from $T_xN$
to $T_x^*N$ by means of this identification. We denote the resulting bilinear
form on $T_x^*N$ by $\omega^{-1}$.


Definition 21.3 If $f$ and $g$ are smooth functions on $N$, define the Poisson
bracket $\{f,g\}$ of $f$ and $g$ by

$$\{f,g\} = -\omega^{-1}(df, dg).$$

In particular, if 1 denotes the constant function on $N$, then $\{1,f\} =
\{f,1\} = 0$ for all smooth functions $f$.


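Example 21.4 If $\omega$ is the canonical 2-form on $T^*M$, then the associated
Poisson bracket may be computed in standard coordinates as

$$\{f,g\} = \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial x_j}$$

for all smooth functions $f$ and $g$ on $T^*M$.

Proof. The linear functional

$$\omega\left(\frac{\partial}{\partial x_j},\cdot\right)$$

has a value of $-1$ on the vector $\partial/\partial p_j$ and a value of 0 on all the other
basic partial derivatives. This means that $\omega(\partial/\partial x_j,\cdot) = -dp_j$. Similarly,
$\omega(\partial/\partial p_j,\cdot) = dx_j$. We may thus compute, for example, that

$$\omega\left(\frac{\partial}{\partial x_j},\frac{\partial}{\partial p_j}\right) = \omega^{-1}(-dp_j, dx_j) = \omega^{-1}(dx_j, dp_j).$$

Meanwhile, $\omega^{-1}(dx_j,dx_k) = \omega^{-1}(dp_j,dp_k) = 0$ and $\omega^{-1}(dp_j,dx_k) = 0$
when $j \neq k$. Thus, we compute that

$$\{f,g\} = -\omega^{-1}\left(\frac{\partial f}{\partial x_j}dx_j + \frac{\partial f}{\partial p_j}dp_j,\; \frac{\partial g}{\partial x_k}dx_k + \frac{\partial g}{\partial p_k}dp_k\right) = \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_k}\delta_{jk} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial x_k}\delta_{jk},$$

which reduces to the claimed expression.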
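
The coordinate formula of Example 21.4 is easy to experiment with symbolically. The following is a minimal sketch, assuming the one-dimensional case $T^*\mathbb{R}$ with coordinates $(x,p)$ and using SymPy; the helper name `poisson` and the sample functions are illustrative choices, not part of the text.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)

def poisson(f, g):
    """Canonical Poisson bracket on T*R in standard coordinates (x, p)."""
    return sp.diff(f, x) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, x)

# Illustrative sample functions (arbitrary choices).
f = x**2 * sp.sin(p)
g = sp.exp(x) + p**3
h = x * p

bracket_xp = poisson(x, p)  # the basic bracket {x, p}
skew_defect = sp.simplify(poisson(f, g) + poisson(g, f))
leibniz_defect = sp.simplify(
    poisson(f, g * h) - (poisson(f, g) * h + g * poisson(f, h))
)
```

With this sign convention the sketch gives $\{x,p\} = 1$, and both defects vanish, in line with the skew-symmetry and product rule stated in Proposition 21.5.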


Proposition 21.5 For any smooth functions $f,g,h$ on $N$, we have

$$\{g,f\} = -\{f,g\}$$

and

$$\{f,gh\} = \{f,g\}h + g\{f,h\}.$$

Proof. Since $\omega$ is skew-symmetric on the tangent space to $N$ at each point
and $\omega^{-1}$ is obtained from $\omega$ by means of an isomorphism of tangent and
cotangent space, $\omega^{-1}$ is a skew-symmetric form on the cotangent space. The
skew-symmetry of the Poisson bracket follows. The second relation follows
from the Leibniz product rule for $d(gh)$ together with the bilinearity of
$\omega^{-1}$.


Definition 21.6 If $f$ is a smooth function on $N$, let $X_f$ be the unique
vector field on $N$ such that

$$df = \omega(X_f,\cdot). \tag{21.8}$$

We call $X_f$ the Hamiltonian vector field associated to $f$.

That is to say, $X_f$ corresponds to $df$ under the isomorphism between
tangent and cotangent spaces established by $\omega$.


Proposition 21.7 For all $f$ and $g$,

$$X_f(g) = \{f,g\} = -X_g(f).$$

Furthermore,

$$\omega(X_f,X_g) = -\{f,g\}.$$

Proof. For each $x \in N$, we are using $\omega$ to identify $T_xN$ with $T_x^*N$. Equation
(21.8) says that under this identification, $X_f$ is identified with $df$.
Thus,

$$-\omega^{-1}(df,dg) = -\omega(X_f,X_g) = -df(X_g) = -X_g(f).$$

Thus, $\{f,g\} = -X_g(f)$, as claimed. A similar argument with the roles of
$f$ and $g$ reversed gives the claimed relationship between $X_f(g)$ and $\{g,f\}$.
Finally,

$$\omega(X_f,X_g) = df(X_g) = X_g(f) = -\{f,g\},$$

as claimed.


Definition 21.8 For any smooth function $f$ on $N$, the Hamiltonian
flow generated by $f$, denoted $\Phi^f$, is the flow generated by the vector field
$-X_f$.

In the case $N = T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$, this definition agrees with our notation in
Sect. 2.5.

Proposition 21.9 For any smooth function $f$ on $N$, the Hamiltonian flow
$\Phi^f$ preserves $\omega$.

Proof. In general, a flow $\Phi$ preserves a differential form $\alpha$ if and only if
the Lie derivative $\mathcal{L}_X\alpha = 0$, where $X$ is the vector field generating $\Phi$. In
our case, since $\omega$ is closed, we have, by (21.7),

$$\mathcal{L}_{X_f}\omega = d(i_{X_f}\omega) = d(df) = 0,$$

since $i_{X_f}\omega$ is, by the definition of $X_f$, equal to $df$.


Proposition 21.10 For any smooth functions $f,g,h$ on $N$, the Jacobi
identity holds:

$$\{f,\{g,h\}\} + \{g,\{h,f\}\} + \{h,\{f,g\}\} = 0.$$

This result shows that the space of smooth functions on $N$ forms a Lie
algebra under the Poisson bracket. The proof of Proposition 21.10 relies on
Proposition 21.9, which in turn relies on the fact that $\omega$ is closed.

Proof. Since the Hamiltonian flow $\Phi^f$ preserves $\omega$, it also preserves $\omega^{-1}$
and thus

$$\omega^{-1}(d(g\circ\Phi_t^f), d(h\circ\Phi_t^f)) = \omega^{-1}(dg,dh)\circ\Phi_t^f,$$

or, equivalently,

$$\{g\circ\Phi_t^f, h\circ\Phi_t^f\} = \{g,h\}\circ\Phi_t^f.$$

Differentiating this relation with respect to $t$ at $t = 0$ gives

$$\{-X_f(g),h\} + \{g,-X_f(h)\} = -X_f(\{g,h\}),$$

or, equivalently,

$$-\{\{f,g\},h\} - \{g,\{f,h\}\} = -\{f,\{g,h\}\}.$$

After moving $-\{f,\{g,h\}\}$ to the left-hand side of the equation and using
the skew-symmetry of the Poisson bracket, we obtain the Jacobi identity.
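
As a sanity check on Proposition 21.10, the Jacobi identity can be tested for the canonical bracket on $T^*\mathbb{R}$. This is a sketch under stated assumptions: one degree of freedom, SymPy, and three arbitrarily chosen sample functions; it checks instances of the identity and does not replace the proof above.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)

def poisson(f, g):
    """Canonical Poisson bracket on T*R in standard coordinates (x, p)."""
    return sp.diff(f, x) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, x)

# Arbitrary sample functions, chosen only for illustration.
f = x**2 * p + sp.cos(x)
g = sp.exp(p) * x
h = x**3 - p**2

jacobi_sum = (
    poisson(f, poisson(g, h))
    + poisson(g, poisson(h, f))
    + poisson(h, poisson(f, g))
)
jacobi_defect = sp.simplify(sp.expand(jacobi_sum))
```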


Proposition 21.11 For any smooth functions $f$ and $g$ on $N$, the Hamiltonian
vector fields $X_f$ and $X_g$ satisfy

$$[X_f,X_g] = X_{\{f,g\}}.$$

Proof. See Exercise 3.
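
The identity of Proposition 21.11 can be checked in coordinates by letting both sides act on a test function. The sketch below assumes $N = T^*\mathbb{R}$ with $X_f = \{f,\cdot\}$ as in Proposition 21.7; the names `ham_vf` and `u` are hypothetical helpers introduced for illustration.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)
u = sp.Function("u")(x, p)  # generic smooth test function

def poisson(f, g):
    """Canonical Poisson bracket on T*R in standard coordinates (x, p)."""
    return sp.diff(f, x) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, x)

def ham_vf(f, w):
    """Apply the Hamiltonian vector field X_f = {f, .} to the function w."""
    return poisson(f, w)

# Arbitrary sample functions.
f = x**2 * p
g = sp.sin(x) + p**3

commutator = ham_vf(f, ham_vf(g, u)) - ham_vf(g, ham_vf(f, u))
vf_defect = sp.simplify(sp.expand(commutator - ham_vf(poisson(f, g), u)))
```

The second-order derivatives of `u` cancel in the commutator, which is why the result is again a first-order operator.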


21.2.3 Hamiltonian Flows and Conserved Quantities


We have seen (Proposition 21.9) that if $f$ is a smooth function, then the
flow generated by $X_f$ preserves $\omega$. We have the following partial converse
to this result.


Proposition 21.12 Suppose $\Phi$ is the flow generated by a vector field $-X$
on $N$. If $\Phi$ preserves $\omega$ then $X$ can be represented locally in the form $X =
X_f$ for some smooth function $f$ on $N$. If $N$ is simply connected, the function
$f$ exists globally on $N$.

Proof. The statement that $\Phi$ preserves $\omega$ can be expressed infinitesimally as

$$\mathcal{L}_X\omega = 0.$$

Since also $\omega$ is closed, (21.7) tells us that

$$d(i_X\omega) = 0.$$

Since $i_X\omega$ is closed, this 1-form can be expressed locally as $i_X\omega = df$ for
some smooth function $f$, which says precisely that $X = X_f$. If $N$ is simply
connected, then every closed 1-form can be expressed globally as $df$, for
some smooth function $f$.

A flow of the sort in Proposition 21.12 is said to be locally Hamiltonian.
Such a flow is said to be (globally) Hamiltonian if the function $f$ in the
proposition can be defined on all of $N$. (Compare Definition 21.8.) If $\Phi$ is a
Hamiltonian flow, the function $f$ such that $\Phi = \Phi^f$ is called a Hamiltonian
generator of $\Phi$. If $N$ is connected, then any two Hamiltonian generators of
$\Phi$ must differ by a constant.

To see that, in general, $f$ is only defined locally, consider the symplectic
manifold $S^1 \times \mathbb{R}$, with symplectic form $\omega = d\phi \wedge dx$, where $\phi$ is the angular
coordinate on $S^1$ and $x$ is the linear coordinate on $\mathbb{R}$. Note that the 1-form
$d\phi$ is independent of the choice of a local angle variable on $S^1$, since any two
such angle functions differ by a constant (an integer multiple of $2\pi$). Thus,
$d\phi$ is a globally defined, smooth 1-form, even though there is no globally
defined, smooth angle function $\phi$. Define a flow $\Phi$ by

$$\Phi_t(\phi,x) = (\phi, x+t).$$

This flow certainly preserves $\omega$, since $dx$ is invariant under translations.
The flow $\Phi$ is generated by the vector field $-X = \partial/\partial x$, and

$$\omega(-\partial/\partial x,\cdot) = d\phi.$$

As we have already noted, however, there is no globally defined function $\phi$
whose differential is $d\phi$.

Although any smooth function on a symplectic manifold N generates a 
Hamiltonian flow, in physical examples there is usually one distinguished 
function with a Hamiltonian flow that is thought of as “the” time-evolution 
of the system. 


Definition 21.13 A Hamiltonian system is a symplectic manifold $N$
together with a distinguished Hamiltonian flow $\Phi^H$, generated by a smooth
function $H$ on $N$, called the Hamiltonian of the system. A function
$f$ is called a conserved quantity for a Hamiltonian system $(N,\Phi^H)$ if
$f(\Phi_t^H(x))$ is independent of $t$ for each fixed $x \in N$.


As in the $\mathbb{R}^{2n}$ case, conserved quantities are useful in understanding the
nature of the dynamics. See the discussion following Corollary 2.26.


Proposition 21.14 For any Hamiltonian system $(N,\Phi^H)$, we have

$$\frac{d}{dt}f(\Phi_t^H(x)) = \{f,H\}(\Phi_t^H(x)),$$

for all $x \in N$, or, more concisely,

$$\frac{df}{dt} = \{f,H\}.$$

In particular, a smooth function $f$ on $N$ is a conserved quantity for a
Hamiltonian system $\Phi^H$ if and only if $\{f,H\} = 0$.

Proof. For the flow generated by any vector field $X$, we have

$$\frac{d}{dt}f(\Phi_t(x)) = (Xf)(\Phi_t(x)).$$

If $X = -X_H$, then by Proposition 21.7, we have the claimed result.

Proposition 21.15 A smooth function $f$ is a conserved quantity for a
Hamiltonian system $(N,\Phi^H)$ if and only if $H$ is invariant under the
Hamiltonian flow $\Phi^f$ generated by $f$.

Proof. By the previous proposition, $H$ is invariant under the flow generated
by $f$ if and only if $\{H,f\} = 0$, which holds if and only if $\{f,H\} = 0$, which
holds if and only if $f$ is a conserved quantity.
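
To illustrate the criterion $\{f,H\} = 0$ of Proposition 21.14, here is a small sketch for a particle in the plane. The Kepler-type Hamiltonian, the constants `m` and `k`, and the angular momentum `L` are illustrative assumptions, not taken from the text; the bracket is the canonical one on $T^*\mathbb{R}^2$.

```python
import sympy as sp

x1, x2, p1, p2 = sp.symbols("x1 x2 p1 p2", real=True)
m, k = sp.symbols("m k", positive=True)
xs, ps = (x1, x2), (p1, p2)

def poisson(f, g):
    """Canonical Poisson bracket on T*R^2, summed over both coordinate pairs."""
    return sum(
        sp.diff(f, xj) * sp.diff(g, pj) - sp.diff(f, pj) * sp.diff(g, xj)
        for xj, pj in zip(xs, ps)
    )

# Assumed example: rotationally invariant Hamiltonian and angular momentum.
H = (p1**2 + p2**2) / (2 * m) - k / sp.sqrt(x1**2 + x2**2)
L = x1 * p2 - x2 * p1

dL_dt = sp.simplify(poisson(L, H))    # rate of change of L along the flow of H
dp1_dt = sp.simplify(poisson(p1, H))  # rate of change of the momentum p1
```

Here `dL_dt` vanishes, so the angular momentum is conserved, while `dp1_dt` does not, so `p1` is not. By Proposition 21.15, the vanishing of $\{L,H\}$ is equivalent to the invariance of $H$ under the flow generated by $L$, which consists of rotations.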


21.2.4 The Liouville Form 


A symplectic manifold N has a natural volume form, which allows us to 
formulate an analog on N of Liouville’s theorem (Theorem 2.27). 


Definition 21.16 If $N$ is a $2n$-dimensional symplectic manifold, the
Liouville form on $N$ is the $2n$-form $\lambda$ given by

$$\lambda = \frac{1}{n!}\omega^n,$$

where $\omega^n = \omega \wedge \cdots \wedge \omega$.


Since $\omega$ is, by assumption, a nondegenerate form on each tangent space
$T_xN$, it is not hard to check that $\lambda$ is a nonvanishing $(2n)$-linear form on
each $T_xN$. Thus, $\lambda$ determines an orientation on $N$. Given a compactly
supported continuous function $f$ on $N$, we can define the integral of $f\lambda$
over $N$, computed with respect to the orientation determined by $\lambda$ itself.
Using the version of the Riesz representation theorem for locally compact
topological spaces, one can show that there is a unique measure, called
the Liouville volume measure, for which the integral of every continuous
compactly supported function $f$ is given by $\int_N f\lambda$.

We are now ready to state the general form of Liouville’s theorem. 


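Theorem 21.17 (Liouville’s Theorem) For any smooth function $f$ on
$N$, the Hamiltonian flow $\Phi^f$ preserves $\lambda$.

Proof. The flow $\Phi^f$ will preserve $\lambda$ if and only if the vector field $X_f$ satisfies
$\mathcal{L}_{X_f}\lambda = 0$. But

$$\mathcal{L}_{X_f}\lambda = \frac{1}{n!}(\mathcal{L}_{X_f}\omega)\wedge\omega\wedge\cdots\wedge\omega + \cdots + \frac{1}{n!}\,\omega\wedge\cdots\wedge\omega\wedge(\mathcal{L}_{X_f}\omega).$$

Since we have already shown (Proposition 21.9) that $\mathcal{L}_{X_f}\omega = 0$, we see
that $\mathcal{L}_{X_f}\lambda = 0$.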


21.3. Exercises 


1. Show that the canonical 2-form w on T*M is nondegenerate. 


Hint: Work in standard coordinates $\{x_j,p_j\}$.

2. Show that if $\Phi : M \to M$ is a diffeomorphism, then the induced map
$\Phi^* : T^*M \to T^*M$ is a symplectomorphism.


3. Using Proposition 21.7 and the Jacobi identity for the Poisson bracket,
verify that

$$[X_f,X_g] = X_{\{f,g\}}$$

for all smooth functions $f$ and $g$ on $N$.


4. If $N$ is compact, show that

$$\int_N \{f,g\}\,\lambda = 0$$

for all smooth functions $f$ and $g$ on $N$.


Hint: Apply Liouville’s theorem to the flow of :


Geometric Quantization on Euclidean Space


22.1 Introduction 


In this chapter, we consider the geometric quantization program in the
setting of the symplectic manifold $\mathbb{R}^{2n}$, with the canonical 2-form $\omega =
dp_j \wedge dx_j$. We begin with the “prequantum” Hilbert space $L^2(\mathbb{R}^{2n})$ and
define “prequantum” operators $Q_{\mathrm{pre}}(f)$. These operators satisfy

$$Q_{\mathrm{pre}}(\{f,g\}) = \frac{1}{i\hbar}[Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)]$$

for all $f$ and $g$. Nevertheless, there are several undesirable aspects to the
prequantization map that make it physically unreasonable to interpret it
as “quantization.” To obtain the quantum Hilbert space, we reduce the
number of variables from $2n$ to $n$. Depending on how we do this reduction,
we will obtain either the position Hilbert space, the momentum Hilbert
space, or the Segal-Bargmann space. Each of these subspaces is preserved
by the prequantized position and momentum operators, and by certain
other operators of the form $Q_{\mathrm{pre}}(f)$.

Although the material in this chapter is a special case of what we do in 
Chap. 23, doing this case first allows us to get a feeling for the methods and 
results of geometric quantization quickly, without needing to develop the 
full machinery of line bundles, connections, and polarizations over general 
symplectic manifolds. In any case, we would need to carry out most of the 
calculations in this chapter eventually, as standard examples of the general 
theory.

Although this chapter does not require the full machinery of symplectic
manifolds, we will make use of the notions of 1-forms and 2-forms on $\mathbb{R}^{2n}$,
along with the notion of the differential of a 1-form. In particular, the
expression (21.6) for the differential of a 1-form will be used.

The reader should be warned that sign conventions in geometric quantization
are not consistent from author to author. The sign conventions
used here are chosen to maintain consistency with the physics literature.
In particular, we could eliminate an annoying minus sign in the definition
of the holomorphic subspace if we were willing to allow the function $p_j$ to
quantize to $i\hbar\,\partial/\partial x_j$. Since, however, the convention $P_j = -i\hbar\,\partial/\partial x_j$ is
universal in the physics literature, we have chosen to be consistent with
that convention and to accept some slightly inconvenient sign choices
elsewhere. We continue to follow the summation convention, in which repeated
indices are always summed on.


22.2 Prequantization 


Ideally, a quantization procedure $Q$, mapping functions on a symplectic
manifold $N$ to operators on some Hilbert space $\mathbf{H}$, should satisfy the
following properties. First, $Q(f)$ should be self-adjoint whenever $f$ is real
valued. Second, we should have $Q(1) = I$, where 1 is the constant function.
Third, $Q(\{f,g\})$ should be equal to $[Q(f),Q(g)]/(i\hbar)$. Fourth, there should
be some sort of “smallness” assumption. In the case $N = \mathbb{R}^{2n}$, for example,
we may require that $\mathbf{H}$ should be irreducible under the action of the
(exponentiated) position and momentum operators. (See Definition 14.6.)
Although Groenewold’s theorem (Theorem 13.13) suggests that it is
unrealistic to expect to find a quantization procedure that satisfies all of these
properties exactly, we try to come as close as possible.

Throughout this chapter, we follow the convention of thinking of a “vector
field” on $\mathbb{R}^N$ as a first-order differential operator, as in Exercise 14 in
Chap. 2. Given, for example, the vector-valued function

$$X(x) = (2x_1 + x_2,\; x_1x_2)$$

on $\mathbb{R}^2$, we identify $X$ with the operator of “differentiation in the direction
of $X$,” that is, with the following first-order differential operator:

$$X = (2x_1 + x_2)\frac{\partial}{\partial x_1} + x_1x_2\frac{\partial}{\partial x_2}.$$

In particular, given a smooth function $f$ on $\mathbb{R}^{2n}$, the Hamiltonian vector
field $X_f$ associated to $f$ is thought of as a differential operator:

$$X_f = \{f,\cdot\} = \frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j}, \tag{22.1}$$

acting on $C^\infty(\mathbb{R}^{2n})$. (Compare Proposition 21.7.) By Proposition 21.11, the
commutator (as differential operators) of two Hamiltonian vector fields $X_f$
and $X_g$ is $X_{\{f,g\}}$. Thus, the operators $i\hbar X_f$ satisfy the desired commutation
relations:

$$[i\hbar X_f, i\hbar X_g] = (i\hbar)^2X_{\{f,g\}} = (i\hbar)(i\hbar X_{\{f,g\}}).$$

It is tempting, then, to define a (pre)quantization map simply by taking
$Q(f) = i\hbar X_f$, viewed as a self-adjoint operator on the Hilbert space
$L^2(\mathbb{R}^{2n})$. This map, however, does not satisfy $Q(1) = I$. If we try to correct
our definition to $Q(f) = i\hbar X_f + f$, where $f$ means the operator of
multiplication by $f$, then $Q(1) = I$ but the desired commutation property is
destroyed.

It is possible to achieve both $Q(1) = I$ and the desired commutation
relations by adding one more term as follows. If $\omega = dp_j \wedge dx_j$ is the
canonical 2-form on $\mathbb{R}^{2n}$, let $\theta$ be any symplectic potential for $\omega$, that is,
any one-form with

$$d\theta = \omega. \tag{22.2}$$

(We may, e.g., take $\theta = p_j\,dx_j$.) For a smooth function $f$ on $\mathbb{R}^{2n}$, define an
operator $Q_{\mathrm{pre}}(f)$, acting on $C^\infty(\mathbb{R}^{2n})$, by

$$Q_{\mathrm{pre}}(f) = i\hbar\left(X_f - \frac{i}{\hbar}\theta(X_f)\right) + f. \tag{22.3}$$

The expression $f$ on the right-hand side of (22.3) means, more precisely,
the operator of multiplication by $f$, and similarly for the function $\theta(X_f)$.
Note that since $\theta$ is a 1-form and $X_f$ is a vector field, $\theta(X_f)$ is a function on
$\mathbb{R}^{2n}$. The operator $Q_{\mathrm{pre}}(f)$ is the prequantization of $f$ and is to be viewed
as an unbounded operator on $L^2(\mathbb{R}^{2n})$, where we refer to $L^2(\mathbb{R}^{2n})$ as the
prequantum Hilbert space.

According to Exercise 1, any divergence free vector field on $\mathbb{R}^N$ is a
skew-symmetric operator on $C_c^\infty(\mathbb{R}^N) \subset L^2(\mathbb{R}^N)$. Meanwhile, each Hamiltonian
vector field is divergence free, as we have already remarked in the proof
of Liouville’s theorem (Theorem 2.27). Thus, for any smooth, real-valued
function $f$ on $\mathbb{R}^{2n}$, the operator $Q_{\mathrm{pre}}(f)$ is at least symmetric. It can be
shown that if $X_f$ is complete, meaning that the associated Hamiltonian flow
is defined for all times, then $Q_{\mathrm{pre}}(f)$ is actually self-adjoint on a natural
domain. (See the discussion following the proof of Proposition 23.13.)

As it turns out, the $\theta(X_f)$ term in (22.3) is precisely what is needed to
restore the desired commutation relations, while still allowing $Q_{\mathrm{pre}}(1)$ to
equal the identity.


Proposition 22.1 For all $f,g \in C^\infty(\mathbb{R}^{2n})$, we have

$$\frac{1}{i\hbar}[Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] = Q_{\mathrm{pre}}(\{f,g\}),$$

where the identity is to be understood as an equality of operators on
$C^\infty(\mathbb{R}^{2n})$.

Before proving this result, it is useful to understand the behavior of the
expression $X_f - (i/\hbar)\theta(X_f)$ occurring in the definition of $Q_{\mathrm{pre}}(f)$.


Definition 22.2 For any symplectic potential $\theta$ and vector field $X$ on $\mathbb{R}^{2n}$,
let $\nabla_X$ denote the covariant derivative operator, acting on $C^\infty(\mathbb{R}^{2n})$,
given by

$$\nabla_X = X - \frac{i}{\hbar}\theta(X). \tag{22.4}$$

Note that our prequantized operators can be written as

$$Q_{\mathrm{pre}}(f) = i\hbar\nabla_{X_f} + f.$$


Proposition 22.3 For any symplectic potential $\theta$, let $\nabla_X$ denote the
associated covariant derivative in (22.4). Then for all smooth vector fields
$X$ and $Y$ on $\mathbb{R}^{2n}$, we have

$$[\nabla_X,\nabla_Y] = \nabla_{[X,Y]} - \frac{i}{\hbar}\omega(X,Y). \tag{22.5}$$

In particular, if $X = X_f$ and $Y = X_g$, we have

$$[\nabla_{X_f},\nabla_{X_g}] = \nabla_{X_{\{f,g\}}} + \frac{i}{\hbar}\{f,g\}.$$

According to standard differential geometric definitions, the 2-form $\omega/\hbar$
on the right-hand side of (22.5) is the curvature of the covariant derivative
$\nabla$. For our purposes, the fact that $[\nabla_{X_f},\nabla_{X_g}]$ is not simply $\nabla_{X_{\{f,g\}}}$ is an
advantage. The extra term in the formula for the commutator is just what
we need to compensate for the failure of the operators $i\hbar X_f + f$ to have
the desired commutation relations.

Proof. Using the easily verified identity $[\nabla_X, f] = X(f)$, we obtain

$$[\nabla_X,\nabla_Y] - \nabla_{[X,Y]} = -\frac{i}{\hbar}\big(X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])\big).$$

In light of (21.6), the right-hand side becomes $-(i/\hbar)(d\theta)(X,Y)$, where
$d\theta = \omega$.
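
The curvature identity (22.5) can be confirmed symbolically for generic vector fields. The sketch below assumes $n = 1$, the potential $\theta = p\,dx$, and vector fields $X = a\,\partial_x + b\,\partial_p$, $Y = c\,\partial_x + e\,\partial_p$ with arbitrary smooth coefficients; all helper names are illustrative.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)
hbar = sp.symbols("hbar", positive=True)
a, b, c, e, u = [sp.Function(name)(x, p) for name in "abceu"]

def apply_vf(coeffs, w):
    """Apply the vector field coeffs[0]*d/dx + coeffs[1]*d/dp to w."""
    return coeffs[0] * sp.diff(w, x) + coeffs[1] * sp.diff(w, p)

def nabla(coeffs, w):
    """Covariant derivative X - (i/hbar)*theta(X), assuming theta = p dx."""
    theta_X = p * coeffs[0]
    return apply_vf(coeffs, w) - sp.I / hbar * theta_X * w

X, Y = (a, b), (c, e)
# Components of the Lie bracket [X, Y].
bracket = (apply_vf(X, c) - apply_vf(Y, a), apply_vf(X, e) - apply_vf(Y, b))
# omega = dp ^ dx evaluated on (X, Y).
omega_XY = b * c - e * a

curvature_defect = sp.simplify(sp.expand(
    nabla(X, nabla(Y, u)) - nabla(Y, nabla(X, u))
    - nabla(bracket, u) + sp.I / hbar * omega_XY * u
))
```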
We may now easily prove Proposition 22.1.

Proof of Proposition 22.1. Using Proposition 22.3, we obtain

$$\begin{aligned}
\frac{1}{i\hbar}[i\hbar\nabla_{X_f} + f,\; i\hbar\nabla_{X_g} + g]
&= i\hbar\left(\nabla_{X_{\{f,g\}}} + \frac{i}{\hbar}\{f,g\}\right) + X_f(g) - X_g(f) \\
&= i\hbar\nabla_{X_{\{f,g\}}} - \{f,g\} + \{f,g\} + \{f,g\},
\end{aligned}$$

which reduces to what we want.

Example 22.4 If $\theta = p_j\,dx_j$, the prequantized position and momentum
operators are given by

$$\begin{aligned}
Q_{\mathrm{pre}}(x_j) &= x_j + i\hbar\frac{\partial}{\partial p_j} \\
Q_{\mathrm{pre}}(p_j) &= -i\hbar\frac{\partial}{\partial x_j}.
\end{aligned}$$

These operators are essentially self-adjoint on $C_c^\infty(\mathbb{R}^{2n})$ and their
self-adjoint extensions satisfy the exponentiated commutation relations of
Definition 14.2.

Proof. We compute that $X_{x_j} = \partial/\partial p_j$ and that $\theta(X_{x_j}) = 0$, giving the
indicated expression for $Q_{\mathrm{pre}}(x_j)$. Meanwhile, $X_{p_j} = -\partial/\partial x_j$ and $\theta(X_{p_j}) =
-p_j$. There is a cancellation of the $\theta(X_{p_j})$ term in the definition of $Q_{\mathrm{pre}}(p_j)$
with the $p_j$ term, leaving $Q_{\mathrm{pre}}(p_j) = i\hbar X_{p_j}$.

The essential self-adjointness of the operators follows from Proposition
9.40. To verify the exponentiated commutation relations, we calculate the
associated one-parameter unitary groups as

$$\begin{aligned}
(e^{itQ_{\mathrm{pre}}(x_j)}\psi)(x,p) &= e^{itx_j}\psi(x, p - t\hbar e_j) \\
(e^{itQ_{\mathrm{pre}}(p_j)}\psi)(x,p) &= \psi(x + t\hbar e_j, p),
\end{aligned} \tag{22.6}$$

where we now let $Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ denote the unique self-adjoint
extensions of the given operators on $C_c^\infty(\mathbb{R}^{2n})$. (Compare Proposition 13.5.)
The exponentiated commutation relations can now be easily verified by
direct calculation.
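
Proposition 22.1 and Example 22.4 lend themselves to a symbolic spot check. The following is a sketch, assuming $n = 1$ and the potential $\theta = p\,dx$, so that $\theta(X_f) = -p\,\partial f/\partial p$; the function names `poisson` and `q_pre` and the two sample observables are illustrative, not part of the text.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)
hbar = sp.symbols("hbar", positive=True)
psi = sp.Function("psi")(x, p)  # generic smooth test function

def poisson(f, g):
    """Canonical Poisson bracket on R^2 in coordinates (x, p)."""
    return sp.diff(f, x) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, x)

def q_pre(f, w):
    """Q_pre(f) w = i*hbar*(X_f - (i/hbar)*theta(X_f)) w + f*w, with theta = p dx."""
    X_f_w = poisson(f, w)          # X_f = {f, .}
    theta_X_f = -p * sp.diff(f, p)  # theta(X_f) for theta = p dx
    return sp.I * hbar * X_f_w + (theta_X_f + f) * w

# Example 22.4: prequantized position and momentum.
pos_defect = sp.simplify(q_pre(x, psi) - (x * psi + sp.I * hbar * sp.diff(psi, p)))
mom_defect = sp.simplify(q_pre(p, psi) + sp.I * hbar * sp.diff(psi, x))

# Proposition 22.1 for two arbitrary sample observables.
f = x**2 * p
g = p**3 + sp.sin(x)
lhs = (q_pre(f, q_pre(g, psi)) - q_pre(g, q_pre(f, psi))) / (sp.I * hbar)
comm_defect = sp.simplify(sp.expand(lhs - q_pre(poisson(f, g), psi)))
```

Dropping the `theta_X_f` term from `q_pre` makes `comm_defect` nonzero for these observables, which is the failure described before (22.2).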

As we have presented things so far, the concept of covariant derivative,
and thus also of prequantization, depends on the choice of symplectic
potential $\theta$. This dependence is, however, illusory; we will now show that the
prequantum maps obtained with two different symplectic potentials are
unitarily equivalent.


Proposition 22.5 Suppose that $\theta^1$ and $\theta^2$ are two different symplectic
potentials for the canonical 2-form $\omega$, so that $d(\theta^1 - \theta^2) = 0$. Let the associated
covariant derivatives be denoted by $\nabla^1$ and $\nabla^2$. Choose a real-valued
function $\gamma$ so that $d\gamma = \theta^1 - \theta^2$ and let $U_\gamma$ be the unitary map of $L^2(\mathbb{R}^{2n})$ to
itself given by

$$U_\gamma\psi = e^{-i\gamma/\hbar}\psi.$$

Then for every vector field $X$, we have

$$U_\gamma\nabla^1_XU_\gamma^{-1} = \nabla^2_X. \tag{22.7}$$

If $Q^j_{\mathrm{pre}}$, $j = 1,2$, are the associated prequantization maps, it follows that

$$U_\gamma Q^1_{\mathrm{pre}}(f)U_\gamma^{-1} = Q^2_{\mathrm{pre}}(f). \tag{22.8}$$

The map $U_\gamma$ is called a gauge transformation.

Proof. The operation of multiplication by $\theta^1(X)$ commutes with
multiplication by $e^{-i\gamma/\hbar}$, whereas

$$X(e^{i\gamma/\hbar}\psi) = e^{i\gamma/\hbar}X\psi + \frac{i}{\hbar}e^{i\gamma/\hbar}X(\gamma)\psi.$$

Since $X(\gamma) = (d\gamma)(X) = \theta^1(X) - \theta^2(X)$, we obtain

$$\nabla^1_X(e^{i\gamma/\hbar}\psi) = e^{i\gamma/\hbar}\left(X + \frac{i}{\hbar}X(\gamma) - \frac{i}{\hbar}\theta^1(X)\right)\psi = e^{i\gamma/\hbar}\nabla^2_X\psi.$$

Multiplying both sides of this equality by $e^{-i\gamma/\hbar}$ gives (22.7). Equation
(22.8) follows by observing that multiplication by $f$ commutes with
multiplication by $e^{-i\gamma/\hbar}$.
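
A concrete instance of the gauge transformation in Proposition 22.5 can be checked directly. This sketch assumes $n = 1$, the two potentials $\theta^1 = p\,dx$ and $\theta^2 = \tfrac12(p\,dx - x\,dp)$, the function $\gamma = xp/2$ (so that $d\gamma = \theta^1 - \theta^2$), and the sign convention $U_\gamma\psi = e^{-i\gamma/\hbar}\psi$; the vector field $X = a\,\partial_x + b\,\partial_p$ has arbitrary coefficients.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)
hbar = sp.symbols("hbar", positive=True)
a, b, psi = [sp.Function(name)(x, p) for name in ("a", "b", "psi")]

theta1_X = p * a                 # theta^1 = p dx applied to X
theta2_X = (p * a - x * b) / 2   # theta^2 = (p dx - x dp)/2 applied to X
gamma = x * p / 2                # assumed choice with d(gamma) = theta^1 - theta^2

def nabla(theta_X, w):
    """Covariant derivative X - (i/hbar)*theta(X) for X = a d/dx + b d/dp."""
    return a * sp.diff(w, x) + b * sp.diff(w, p) - sp.I / hbar * theta_X * w

U = sp.exp(-sp.I * gamma / hbar)     # assumed sign convention for U_gamma
conjugated = U * nabla(theta1_X, psi / U)   # U nabla^1 U^{-1} applied to psi
gauge_defect = sp.simplify(sp.expand(conjugated - nabla(theta2_X, psi)))
```

With the opposite sign in the exponent of `U`, the conjugated operator would not equal $\nabla^2_X$ for this choice of $\gamma$, which is why the convention is stated explicitly as an assumption.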


22.3 Problems with Prequantization 


Given the naturalness of the prequantization construction, it is tempting
to think that prequantization could actually be considered as quantization.
Why not take our Hilbert space to be $L^2(\mathbb{R}^{2n})$ and the quantized operators
to be $Q_{\mathrm{pre}}(f)$? To answer this question, we now examine some undesirable
properties of prequantization.

In the first place, the Hilbert space $L^2(\mathbb{R}^{2n})$ is very far from irreducible
under the action of the quantized position and momentum operators, in
contrast to the ordinary Schrödinger Hilbert space $L^2(\mathbb{R}^n)$, which is
irreducible, by Proposition 14.7. Indeed, in Sect. 22.4, we will construct a large
family of invariant subspaces. (See Proposition 22.13.)

In the second place, the prequantization map is very far from being
multiplicative. Of course, since quantum operators do not commute, we cannot
expect any quantization scheme $Q$ to satisfy $Q(fg) = Q(f)Q(g)$ for all $f$
and $g$. Nevertheless, the standard quantization schemes we have considered
in Chap. 13 do satisfy this relation for certain classes of observables $f$ and
$g$. In the Weyl quantization, for example, we have multiplicativity if $f$ and
$g$ are both functions of $x$ only, independent of $p$ (or functions of $p$,
independent of $x$). For the prequantization map, however, we almost never have
multiplicativity, for the simple reason that $Q_{\mathrm{pre}}(fg)$ is a first-order
differential operator, whereas $Q_{\mathrm{pre}}(f)Q_{\mathrm{pre}}(g)$ is second-order, provided there is
at least one point where $X_f$ and $X_g$ are both nonzero.

In the third place, the prequantization map badly fails to map positive
functions to positive operators. Although most of the quantization schemes
in Chap. 13 do not always map positive functions to positive operators, they
somehow come close to doing so. Indeed, $Q_{\mathrm{Weyl}}$, $Q_{\mathrm{Wick}}$, and $Q_{\mathrm{anti\text{-}Wick}}$
all map the harmonic oscillator Hamiltonian to a non-negative operator,
since $a^*a + (1/2)I$, $a^*a$, and $aa^*$ are all non-negative. (See Exercise 4 in
Chap. 13.) By contrast, the prequantized harmonic oscillator Hamiltonian
has spectrum that is unbounded below, as we now demonstrate.


Proposition 22.6 Consider a harmonic oscillator Hamiltonian of the
form

$$H(x,p) = \frac{1}{2m}\left(p^2 + (m\omega x)^2\right).$$

Then for each integer $n$, the number $n\hbar\omega$ is an eigenvalue for $Q_{\mathrm{pre}}(H)$.


Note that $n$ in the proposition is allowed to be negative, so that the
spectrum of $Q_{\mathrm{pre}}(H)$ is not even bounded below. On the other hand, in
Sect. 22.5, we will consider a certain closed subspace $\mathbf{H}_\alpha$ of the prequantum
Hilbert space, which is one candidate for the quantum Hilbert space. For
appropriate choice of $\alpha$, the space $\mathbf{H}_\alpha$ is invariant under $Q_{\mathrm{pre}}(H)$ and the
restriction of $Q_{\mathrm{pre}}(H)$ is self-adjoint with spectrum $n\hbar\omega$, where $n$ ranges
over the non-negative integers. See Proposition 22.14. And when
we introduce half-forms in Sect. 23.7, we will finally restore the spectrum
$(n + 1/2)\hbar\omega$, where $n$ ranges over the non-negative integers, that we found
in Chap. 11.

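Proof. We can write $H$ as

$$H(x,p) = \frac{1}{2m}(p^2 + y^2),$$

where $y = m\omega x$. The flow associated to this Hamiltonian consists of
rotations in the $(y,p)$-plane. If we choose our symplectic potential to be

$$\theta = \frac{1}{2}(p\,dx - x\,dp) = \frac{1}{2m\omega}(p\,dy - y\,dp),$$

then the $\theta(X_H)$ term in $Q_{\mathrm{pre}}(H)$ cancels with the $H$ term, leaving

$$Q_{\mathrm{pre}}(H) = i\hbar X_H = i\hbar\omega\left(y\frac{\partial}{\partial p} - p\frac{\partial}{\partial y}\right).$$

Now, if $\phi$ denotes the angular variable for polar coordinates in the $(y,p)$-plane,
then $y\,\partial/\partial p - p\,\partial/\partial y$ is just $\partial/\partial\phi$. Thus, we can find eigenvectors
for $Q_{\mathrm{pre}}(H)$ of the form

$$\psi_n(r,\phi) = f(r)e^{in\phi},$$

where $n$ is an integer and $f$ is an arbitrary function with $\int_0^\infty |f(r)|^2\,r\,dr < \infty$.
Since $Q_{\mathrm{pre}}(H)\psi_n = i\hbar\omega\,\partial\psi_n/\partial\phi = -n\hbar\omega\,\psi_n$ and $n$ ranges over all
integers, every integer multiple of $\hbar\omega$ occurs as an eigenvalue.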
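
The eigenvectors in the proof of Proposition 22.6 can be written in Cartesian form and tested directly. The sketch below assumes the expression $Q_{\mathrm{pre}}(H) = i\hbar\omega(y\,\partial_p - p\,\partial_y)$ in the coordinates $(y,p)$, one particular Gaussian radial profile, and the representation $r^{|n|}e^{in\phi} = (y \pm ip)^{|n|}$; all names are illustrative.

```python
import sympy as sp

y, p = sp.symbols("y p", real=True)
hbar, omega = sp.symbols("hbar omega", positive=True)

def q_pre_H(w):
    """Q_pre(H) = i*hbar*omega*(y d/dp - p d/dy) in the coordinates (y, p)."""
    return sp.I * hbar * omega * (y * sp.diff(w, p) - p * sp.diff(w, y))

# One square-integrable radial profile (an arbitrary choice of f(r)).
radial = sp.exp(-(y**2 + p**2) / 2)

def eigen_defect(n):
    """Residual of Q_pre(H) psi_n = -n*hbar*omega*psi_n for psi_n ~ r^|n| e^{i n phi}."""
    angular = (y + sp.I * p) ** n if n >= 0 else (y - sp.I * p) ** (-n)
    psi_n = radial * angular
    return sp.simplify(sp.expand(q_pre_H(psi_n) + n * hbar * omega * psi_n))

defects = [eigen_defect(n) for n in range(-3, 4)]
```

In this sketch $\psi_n$ has eigenvalue $-n\hbar\omega$, so positive and negative multiples of $\hbar\omega$ both occur and the spectrum is unbounded below.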

The conclusion of the matter is that it is not physically reasonable to 
use prequantization as our quantization scheme. Instead, we will pass to 
a “smaller” Hilbert space on which the position and momentum operators 
act irreducibly. 


22.4 Quantization 


To obtain a Hilbert space that can be thought of as giving us a “quantization”
(as opposed to a prequantization) of $\mathbb{R}^{2n}$, we restrict ourselves to
a subspace of the prequantum Hilbert space. The idea is that we should
be using only half of the variables on $\mathbb{R}^{2n}$. We might, for example, restrict
ourselves to functions that depend only on the position variables and are
independent of the momentum variables. Now, the space of functions $\psi$ that
are, say, independent of $p$ in the ordinary sense (i.e., $\psi(x,p) = \psi(x,p')$)
is not invariant under gauge transformations (the maps $U_\gamma$ in
Proposition 22.5). The gauge-invariant notion of being independent of $p$ is that
the covariant derivatives of $\psi$ should be zero in the $p$-directions. Similarly,
we may consider spaces of functions with covariant derivatives that are
zero in some other set of $n$ directions.


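Definition 22.7 Fix a symplectic potential $\theta$. Define the position
subspace as the subspace of $C^\infty(\mathbb{R}^{2n})$ consisting of functions $\psi$ for which

$$\nabla_{\partial/\partial p_j}\psi = 0$$

for all $j$. Similarly, define the momentum subspace as the subspace of
$C^\infty(\mathbb{R}^{2n})$ consisting of functions $\psi$ for which

$$\nabla_{\partial/\partial x_j}\psi = 0$$

for all $j$. Finally, define the holomorphic subspace with parameter $\alpha$ to
be the subspace of $C^\infty(\mathbb{R}^{2n})$ consisting of functions $\psi$ for which

$$\nabla_{\partial/\partial\bar{z}_j}\psi = 0$$

for all $j$, where $z_j = x_j - i\alpha p_j$ and where $\partial/\partial z_j$ and $\partial/\partial\bar{z}_j$ are defined by

$$\frac{\partial}{\partial z_j} = \frac{1}{2}\left(\frac{\partial}{\partial x_j} + \frac{i}{\alpha}\frac{\partial}{\partial p_j}\right), \qquad \frac{\partial}{\partial\bar{z}_j} = \frac{1}{2}\left(\frac{\partial}{\partial x_j} - \frac{i}{\alpha}\frac{\partial}{\partial p_j}\right). \tag{22.9}$$

The operators $\partial/\partial z_j$ and $\partial/\partial\bar{z}_j$ are nothing but the usual complex derivative
operators on $\mathbb{C}^n$ written in terms of the variables $x$ and $p$, where we
identify $\mathbb{R}^{2n}$ with $\mathbb{C}^n$ by the map $(x,p) \mapsto x - i\alpha p$.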

Of course, the exact form of the various subspaces in Definition 22.7
depends on the choice of symplectic potential. It is convenient to use the
symplectic potential $\theta = p_j\,dx_j$.

Proposition 22.8 Take the symplectic potential $\theta = p_j\,dx_j$. Then the
position, momentum, and holomorphic subspaces may be computed as follows.
The position subspace consists of smooth functions $\psi$ on $\mathbb{R}^{2n}$ of the
form

$$\psi(x,p) = \phi(x),$$

where $\phi$ is an arbitrary smooth function on $\mathbb{R}^n$. The momentum subspace
consists of smooth functions $\psi$ of the form

$$\psi(x,p) = e^{ix\cdot p/\hbar}\phi(p), \tag{22.10}$$

where $\phi$ is an arbitrary smooth function on $\mathbb{R}^n$. Finally, the holomorphic
subspace consists of functions of the form

$$\psi(x,p) = F(z_1,\ldots,z_n)e^{-\alpha|p|^2/(2\hbar)}, \tag{22.11}$$

where $F$ is an arbitrary holomorphic function on $\mathbb{C}^n$ and where $z_j = x_j -
i\alpha p_j$.

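Proof. Since $\theta(\partial/\partial p_j) = 0$, we have $\nabla_{\partial/\partial p_j} = \partial/\partial p_j$, so that functions
that are covariantly constant in the $p$-directions are actually constant in
the $p$-directions. Meanwhile, $\theta(\partial/\partial x_j) = p_j$ and so

$$\nabla_{\partial/\partial x_j} = \frac{\partial}{\partial x_j} - \frac{i}{\hbar}p_j.$$

Now, any function $\psi$ on $\mathbb{R}^{2n}$ can be written in the form $e^{ix\cdot p/\hbar}\phi(x,p)$ for
some other function $\phi$. If we use this form to compute $\nabla_{\partial/\partial x_j}\psi$, there is a
convenient cancellation, giving

$$(\nabla_{\partial/\partial x_j}\psi)(x,p) = e^{ix\cdot p/\hbar}\frac{\partial\phi}{\partial x_j}.$$

Thus, $\nabla_{\partial/\partial x_j}\psi = 0$ for all $j$ if and only if $\phi$ is independent of $x$.
Finally, we note that $\theta(\partial/\partial\bar{z}_j) = p_j/2$, so that

$$\nabla_{\partial/\partial\bar{z}_j} = \frac{\partial}{\partial\bar{z}_j} - \frac{i}{2\hbar}p_j.$$

Any function $\psi$ on $\mathbb{R}^{2n}$ can be written in the form $\psi(x,p) = e^{-\alpha|p|^2/(2\hbar)}F$
for some other function $F$, where we note that

$$e^{-\alpha|p|^2/(2\hbar)} = \exp\Big\{\sum_j (z_j - \bar{z}_j)^2/(8\alpha\hbar)\Big\}.$$

Thus,

$$\frac{\partial}{\partial\bar{z}_j}e^{-\alpha|p|^2/(2\hbar)} = -\frac{z_j - \bar{z}_j}{4\alpha\hbar}e^{-\alpha|p|^2/(2\hbar)} = \frac{i}{2\hbar}p_j\,e^{-\alpha|p|^2/(2\hbar)}.$$

When we compute $\nabla_{\partial/\partial\bar{z}_j}\psi$ using the indicated form, there is another
convenient cancellation, giving

$$(\nabla_{\partial/\partial\bar{z}_j}\psi)(x,p) = e^{-\alpha|p|^2/(2\hbar)}\frac{\partial F}{\partial\bar{z}_j}.$$

Thus, $\nabla_{\partial/\partial\bar{z}_j}\psi = 0$ for all $j$ if and only if $F$ is holomorphic as a function
of the variables $z_j = x_j - i\alpha p_j$.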
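
The description of the holomorphic subspace in Proposition 22.8 can be tested on explicit functions. This sketch assumes $n = 1$, the potential $\theta = p\,dx$, and the operator $\nabla_{\partial/\partial\bar z} = \tfrac12(\partial_x - (i/\alpha)\partial_p) - (i/(2\hbar))p$; the sample function $F(z) = z^3 + e^z$ is an arbitrary holomorphic choice.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)
hbar, alpha = sp.symbols("hbar alpha", positive=True)
z = x - sp.I * alpha * p

def nabla_zbar(w):
    """Covariant derivative in the zbar-direction, assuming theta = p dx."""
    d_zbar = (sp.diff(w, x) - sp.I / alpha * sp.diff(w, p)) / 2
    return d_zbar - sp.I / (2 * hbar) * p * w  # theta(d/dzbar) = p/2

gauss = sp.exp(-alpha * p**2 / (2 * hbar))

# F holomorphic in z: the covariant zbar-derivative should vanish.
holo_defect = sp.simplify(sp.expand(nabla_zbar((z**3 + sp.exp(z)) * gauss)))

# F = conj(z) is not holomorphic: the result is gauss * dF/dzbar = gauss.
antiholo_residual = sp.simplify(
    sp.expand(nabla_zbar(sp.conjugate(z) * gauss) - gauss)
)
```

The second check illustrates the general formula $(\nabla_{\partial/\partial\bar z}\psi) = e^{-\alpha p^2/(2\hbar)}\,\partial F/\partial\bar z$ from the proof, here with $\partial F/\partial\bar z = 1$.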

From the physical standpoint, we do not merely want a vector space of
functions, but a Hilbert space. It is natural, then, to look at functions of the
forms computed in Proposition 22.8 that belong to $L^2(\mathbb{R}^{2n})$. In the case of
the position and momentum subspaces, we encounter a serious problem:
There are no nonzero functions of the indicated form that are square
integrable over $\mathbb{R}^{2n}$. After all, if $\psi$ is in the position subspace, then $\psi(x,p)$ is
independent of $p$ and the integral of $|\psi|^2$ over the $p$-variables will be
infinite, unless $\psi$ is zero almost everywhere. If $\psi$ is in the momentum subspace,
$|\psi|^2$ is independent of $x$ and we have a similar problem.

The solution to this problem is to integrate not over $\mathbb{R}^{2n}$ but over $\mathbb{R}^n$.
Although the “proper” way to make this change of integration is to
introduce the notion of “half-forms,” as in Chap. 23, we will content ourselves
in this chapter with the following simplistic rule: integrate only over the
variables on which $|\psi|^2$ depends. If we want to get a Hilbert space (not just
an inner product space), we must also allow functions of the specified form
that are square integrable but not necessarily smooth. We may therefore
identify the position Hilbert space and momentum Hilbert space as follows.


Conclusion 22.9 The position Hilbert space is the space of functions on
$\mathbb{R}^{2n}$ of the form

$$\psi(x,p) = \phi(x),$$

where $\phi \in L^2(\mathbb{R}^n)$. The norm of such a function is computed as

$$\|\psi\|^2 = \int_{\mathbb{R}^n}|\phi(x)|^2\,dx.$$

The momentum Hilbert space is the space of functions on $\mathbb{R}^{2n}$ of the form

$$\psi(x,p) = e^{ix\cdot p/\hbar}\phi(p),$$

where $\phi \in L^2(\mathbb{R}^n)$. The norm of such a function is computed as

$$\|\psi\|^2 = \int_{\mathbb{R}^n}|\phi(p)|^2\,dp.$$


If we consider the holomorphic subspace, we find that it behaves better 
than the position and momentum subspaces, in that there exist nonzero 
functions of the form (22.11) that are square integrable over R?”, as we 
will see shortly. Furthermore, the space of functions of the form (22.11) 
that are square integrable over R?” form a closed subspace of L?(R?"), by 
the same argument as in the proof of Proposition 14.15. 
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Conclusion 22.10 The holomorphic Hilbert space consists of those 
functions w of the form (22.11) that are square integrable over R?”. If w 
is identified with the holomorphic function F in (22.11), then this Hilbert 
space may be identified with HL?(C",v), where 


v(z) = e lim 2|?/(ah) 


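The space $\mathcal{H}L^2(\mathbb{C}^n, \nu)$ is nothing but an invariant form of the
Segal–Bargmann space (Definition 14.14), where here “invariant” means that
the density $\nu$ is invariant under translations in the real directions. This
space can be identified unitarily with the ordinary Segal–Bargmann space
$\mathcal{H}L^2(\mathbb{C}^n, \mu_{2\alpha\hbar})$ as follows. Define a map
$\Psi : \mathcal{H}L^2(\mathbb{C}^n, \mu_{2\alpha\hbar}) \to \mathcal{H}L^2(\mathbb{C}^n, \nu)$ by

$$\Psi(F)(z) = (2\pi\alpha\hbar)^{-n/2} e^{-z^2/(4\alpha\hbar)} F(z), \tag{22.12}$$

where $z^2 = z_1^2 + \cdots + z_n^2$. Then a simple calculation shows that

$$\|\Psi(F)\|^2_{L^2(\mathbb{C}^n, \nu)} = \int_{\mathbb{C}^n} |F(z)|^2 \mu_{2\alpha\hbar}(z) \, dz.$$

Since also $e^{-z^2/(4\alpha\hbar)}$ is holomorphic as a function of $z$, we see that $\Psi$ maps
$\mathcal{H}L^2(\mathbb{C}^n, \mu_{2\alpha\hbar})$ isometrically into $\mathcal{H}L^2(\mathbb{C}^n, \nu)$. The map $\Psi$ has an inverse
given by multiplication by $(2\pi\alpha\hbar)^{n/2} e^{z^2/(4\alpha\hbar)}$, showing that $\Psi$ is actually
unitary. In particular, there exist many nonzero holomorphic functions on
$\mathbb{C}^n$ that belong to $\mathcal{H}L^2(\mathbb{C}^n, \nu)$.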
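
The “simple calculation” behind the isometry can be sketched pointwise as follows. The sketch assumes that the Segal–Bargmann density is normalized as $\mu_t(z) = (\pi t)^{-n} e^{-|z|^2/t}$; Definition 14.14 is not reproduced in this section, so that normalization should be checked against it.

```latex
% Sketch only. Assumption: \mu_t(z) = (\pi t)^{-n} e^{-|z|^2/t}.
% With z^2 = z_1^2 + ... + z_n^2 we have Re(z^2) = |Re z|^2 - |Im z|^2.
\begin{align*}
|\Psi(F)(z)|^2 \, \nu(z)
  &= (2\pi\alpha\hbar)^{-n} \, \bigl| e^{-z^2/(4\alpha\hbar)} \bigr|^2 \,
     |F(z)|^2 \, e^{-|\mathrm{Im}\, z|^2/(\alpha\hbar)} \\
  &= (2\pi\alpha\hbar)^{-n} \,
     e^{-(|\mathrm{Re}\, z|^2 - |\mathrm{Im}\, z|^2)/(2\alpha\hbar)} \,
     e^{-2|\mathrm{Im}\, z|^2/(2\alpha\hbar)} \, |F(z)|^2 \\
  &= (2\pi\alpha\hbar)^{-n} \, e^{-|z|^2/(2\alpha\hbar)} \, |F(z)|^2
   = |F(z)|^2 \, \mu_{2\alpha\hbar}(z).
\end{align*}
```

Integrating both sides over $\mathbb{C}^n$ then gives the equality of norms.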

We will regard any of the Hilbert spaces in Conclusions 22.9 and 22.10
as our quantum Hilbert space. These spaces are to be compared to the
prequantum Hilbert space $L^2(\mathbb{R}^{2n})$, which is in some sense “bigger,” consisting
of functions of twice as many variables. Note there are multiple possibilities
for the quantum Hilbert space. To reduce from the prequantum Hilbert
space to the quantum Hilbert space, we have to choose a set of $n$ variables,
and then we look at functions that depend only on those $n$ variables. Indeed,
there are many other possibilities for the quantum Hilbert space; we
have considered only the most common choices. We defer a discussion of
the general theory until Chap. 23.

The reader may wonder why we are using the definition $z_j = x_j - i\alpha p_j$
($\alpha > 0$) rather than $z_j = x_j + i\alpha p_j$. If we repeated the preceding calculations
with $z_j = x_j + i\alpha p_j$, with a corresponding sign change in the definition of
$\partial/\partial \bar{z}_j$, we would find that $\psi$ satisfies $\nabla_{\partial/\partial \bar{z}_j} \psi = 0$ for all $j$ if and only if $\psi$ is
of the form

$$\psi(x, p) = F(z_1, \ldots, z_n) e^{\alpha |p|^2/(2\hbar)}, \tag{22.13}$$

where $F$ is holomorphic on $\mathbb{C}^n$. The change in sign in the exponent between
(22.11) and (22.13) has a drastic effect: There are no nonzero holomorphic
functions $F$ for which the function $\psi$ in (22.13) is square integrable over
$\mathbb{R}^{2n}$. (See Exercise 3.) Unlike the situation with the position and momentum
Hilbert spaces, there is no natural way to alter the domain of integration
to make a function of the form (22.13) have finite norm.

We see, then, that there is a big difference between the definitions
$z_j = x_j - i\alpha p_j$ and $z_j = x_j + i\alpha p_j$. In the general framework of geometric
quantization, we will have a similar distinction, where complex structures
satisfying a certain positivity condition behave well, whereas the “opposite”
complex structures behave badly. (See Definition 23.19 in Sect. 23.4.)


22.5 Quantization of Observables 


Now that we have constructed our quantum (as opposed to prequantum)
Hilbert spaces, we need to construct operators on these spaces. According
to the standard geometric quantization program, the quantum operator
associated with a function $f$ is supposed to be simply the restriction to the
quantum Hilbert space of the prequantum operator $Q_{\mathrm{pre}}(f)$, provided that
$Q_{\mathrm{pre}}(f)$ leaves the quantum Hilbert space invariant.


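Proposition 22.11 The position, momentum, and holomorphic subspaces
in Definition 22.7 are all invariant under the prequantum operators $Q_{\mathrm{pre}}(x_j)$
and $Q_{\mathrm{pre}}(p_j)$. Specifically, in the position subspace, we have

$$Q_{\mathrm{pre}}(x_j) \phi(x) = x_j \phi(x),$$

$$Q_{\mathrm{pre}}(p_j) \phi(x) = -i\hbar \frac{\partial \phi}{\partial x_j};$$

in the momentum subspace, we have

$$Q_{\mathrm{pre}}(x_j) \bigl( e^{i x \cdot p/\hbar} \phi(p) \bigr) = e^{i x \cdot p/\hbar} \Bigl( i\hbar \frac{\partial \phi}{\partial p_j} \Bigr),$$

$$Q_{\mathrm{pre}}(p_j) \bigl( e^{i x \cdot p/\hbar} \phi(p) \bigr) = e^{i x \cdot p/\hbar} \bigl( p_j \phi(p) \bigr);$$

and in the holomorphic subspace, we have

$$Q_{\mathrm{pre}}(x_j) \bigl( F(z) e^{-\alpha |p|^2/(2\hbar)} \bigr) = \Bigl( \alpha\hbar \frac{\partial F}{\partial z_j} + z_j F(z) \Bigr) e^{-\alpha |p|^2/(2\hbar)},$$

$$Q_{\mathrm{pre}}(p_j) \bigl( F(z) e^{-\alpha |p|^2/(2\hbar)} \bigr) = \Bigl( -i\hbar \frac{\partial F}{\partial z_j} \Bigr) e^{-\alpha |p|^2/(2\hbar)}.$$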


Proof. See Exercise 4. ∎

The invariance of the three subspaces under the prequantized position
and momentum operators follows from a general result in geometric
quantization, that for a real-valued function $f$, the prequantum operator $Q_{\mathrm{pre}}(f)$
preserves a given quantum space if and only if the Hamiltonian flow
generated by $f$ preserves the polarization defining the quantum space. The
term “polarization” refers to the set of directions in which the elements of
the quantum space are covariantly constant. In the case of the position,
momentum, and holomorphic spaces, the set of such directions is the same
at every point, which means that the polarization is invariant under
translations. But the Hamiltonian flows generated by $x_j$ and $p_j$ are nothing
but translations in the $-p_j$-directions and the $x_j$-directions, respectively.
Of course, in this simple example, we can verify the invariance by direct
computation, which also gives the indicated form of the operators on each
subspace.

Note also that in each case, the “preferred” functions act simply as
multiplication operators. In the position subspace, for example, the position
operator $Q_{\mathrm{pre}}(x_j)$ acts simply as multiplication by $x_j$, whereas in the
momentum subspace, the operator $Q_{\mathrm{pre}}(p_j)$ acts as multiplication by $p_j$.
Finally, in the holomorphic subspace, we have

$$Q_{\mathrm{pre}}(z_j) \bigl( F(z) e^{-\alpha |p|^2/(2\hbar)} \bigr) = \bigl( z_j F(z) \bigr) e^{-\alpha |p|^2/(2\hbar)},$$

where $z_j = x_j - i\alpha p_j$, since the terms involving $\partial F/\partial z_j$ cancel.
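
The cancellation can be made explicit by combining the two holomorphic-subspace formulas of Proposition 22.11. The sketch below assumes that $Q_{\mathrm{pre}}$ is extended complex-linearly to the complex-valued function $z_j = x_j - i\alpha p_j$, and writes $\psi = F(z) e^{-\alpha|p|^2/(2\hbar)}$ for brevity.

```latex
% Sketch only. Assumption: Q_pre(x_j - i\alpha p_j) = Q_pre(x_j) - i\alpha Q_pre(p_j).
\begin{align*}
Q_{\mathrm{pre}}(z_j)\psi
  &= Q_{\mathrm{pre}}(x_j)\psi - i\alpha \, Q_{\mathrm{pre}}(p_j)\psi \\
  &= \Bigl( \alpha\hbar \frac{\partial F}{\partial z_j} + z_j F \Bigr) e^{-\alpha|p|^2/(2\hbar)}
     - i\alpha \Bigl( -i\hbar \frac{\partial F}{\partial z_j} \Bigr) e^{-\alpha|p|^2/(2\hbar)} \\
  &= \Bigl( \alpha\hbar \frac{\partial F}{\partial z_j} - \alpha\hbar \frac{\partial F}{\partial z_j}
     + z_j F \Bigr) e^{-\alpha|p|^2/(2\hbar)}
   = (z_j F) \, e^{-\alpha|p|^2/(2\hbar)}.
\end{align*}
```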

We now focus on the position Hilbert space and look for operators of the
form $Q_{\mathrm{pre}}(f)$ that leave the position subspace invariant.


Proposition 22.12 The position subspace is invariant under $Q_{\mathrm{pre}}(f)$
whenever $f$ is of the form

$$f(x, p) = a(x) + b_j(x) p_j \tag{22.14}$$

for some smooth functions $a$ and $b_1, \ldots, b_n$ on $\mathbb{R}^n$. On the other hand, the
position subspace is not invariant under the operator $Q_{\mathrm{pre}}(p_j^2)$.


Proof. If $f$ is of the form (22.14), calculation shows that $\theta(X_f) + f = a(x)$.
If we drop any terms in $X_f$ involving $\partial/\partial p_j$, since these are zero on the
position subspace, we end up with

$$Q_{\mathrm{pre}}(f)(\phi(x)) = -i\hbar \, b_j(x) \frac{\partial \phi}{\partial x_j} + a(x) \phi(x), \tag{22.15}$$

which is again in the position subspace. [There is no $p$-dependence in the
coefficient of $\partial/\partial x_j$ in (22.15) because $\partial f/\partial p_j$ is independent of $p$.] On
the other hand, direct calculation shows that the restriction to the position
subspace of $Q_{\mathrm{pre}}(p_j^2)$ is

$$-2i\hbar \, p_j \frac{\partial}{\partial x_j} - p_j^2,$$

which does not preserve the space of functions on $\mathbb{R}^{2n}$ that are independent
of $p$. ∎
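
The “calculation” in this proof can be spelled out as follows. The sketch assumes the conventions $\theta = p_j \, dx_j$, $X_f = (\partial f/\partial x_j) \, \partial/\partial p_j - (\partial f/\partial p_j) \, \partial/\partial x_j$, $\nabla_X = X - (i/\hbar)\theta(X)$, and $Q_{\mathrm{pre}}(f) = i\hbar \nabla_{X_f} + f$. These are inferred from the formulas in Proposition 22.11 rather than quoted from Sect. 22.2, so they should be treated as assumptions.

```latex
% Sketch only, under the sign conventions stated in the lead-in (assumed, not quoted).
% Part 1: f(x,p) = a(x) + b_j(x) p_j, acting on \phi = \phi(x).
\begin{align*}
X_f &= \Bigl( \frac{\partial a}{\partial x_j} + \frac{\partial b_k}{\partial x_j} p_k \Bigr)
       \frac{\partial}{\partial p_j} - b_j(x) \frac{\partial}{\partial x_j}, \\
\theta(X_f) &= -b_j(x) p_j, \qquad \theta(X_f) + f = a(x), \\
Q_{\mathrm{pre}}(f)\phi &= i\hbar X_f \phi + \bigl( \theta(X_f) + f \bigr)\phi
   = -i\hbar \, b_j(x) \frac{\partial \phi}{\partial x_j} + a(x)\phi(x).
\end{align*}
% Part 2: f = p_j^2 for one fixed j (no summation on j here).
\begin{align*}
X_f &= -2 p_j \frac{\partial}{\partial x_j}, \qquad \theta(X_f) = -2 p_j^2, \\
Q_{\mathrm{pre}}(p_j^2)\phi &= -2i\hbar \, p_j \frac{\partial \phi}{\partial x_j}
   + (-2p_j^2 + p_j^2)\phi
   = -2i\hbar \, p_j \frac{\partial \phi}{\partial x_j} - p_j^2 \phi.
\end{align*}
```

The result of the second computation depends on $p$ whenever $\phi$ is nonzero, which is why it leaves the position subspace.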




It should be noted that the expression on the right-hand side of (22.15)
is not a self-adjoint, or even symmetric, operator on $L^2(\mathbb{R}^n)$, unless the
vector field $b(x)$ happens to be divergence free. (Even though the vector
field $X_f$ is divergence free on $\mathbb{R}^{2n}$, the way $X_f$ acts on functions that are
independent of $p$ is not necessarily a divergence free vector field on $\mathbb{R}^n$.)
This undesirable feature of our quantization scheme is the result of our
simplistic method of passing from $L^2(\mathbb{R}^{2n})$ to $L^2(\mathbb{R}^n)$ in our derivation of
Conclusion 22.9. When we do this reduction properly, using half-forms, we
will obtain a self-adjoint operator. See Sect. 23.6.

We now consider the behavior of the holomorphic subspace under the 
prequantized position and momentum operators. 


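Proposition 22.13 For any $\alpha > 0$, let $\mathcal{H}_\alpha$ be the subspace of $L^2(\mathbb{R}^{2n})$
consisting of smooth functions $\psi$ that satisfy $\nabla_{\partial/\partial \bar{z}_j} \psi = 0$, where $\partial/\partial \bar{z}_j$
is as in (22.9). Then $\mathcal{H}_\alpha$ is a closed subspace of $L^2(\mathbb{R}^{2n})$ and $\mathcal{H}_\alpha$ is
invariant under the one-parameter unitary groups generated by $Q_{\mathrm{pre}}(x_j)$ and
$Q_{\mathrm{pre}}(p_j)$. Furthermore, $Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ act irreducibly on $\mathcal{H}_\alpha$ in the
sense of Definition 14.6.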


For each $\alpha > 0$, the holomorphic Hilbert space is a subspace of the
prequantum Hilbert space invariant under the exponentiated position and
momentum operators. Thus, the prequantum Hilbert space is far from being
irreducible under the action of those operators.

Proof. The invariance of $\mathcal{H}_\alpha$ is a simple calculation (Exercise 5).
Irreducibility can be established by reducing to the previously established
irreducibility of the Segal–Bargmann space under the operators $T_a$ in
Theorem 14.16. To this end, we should check that the unitary map $\Psi$ in (22.12)
intertwines products of exponentials of $Q_{\mathrm{pre}}(x_j)$ and $Q_{\mathrm{pre}}(p_j)$ with
operators of the form $T_a$ (with $\hbar$ replaced by $2\alpha\hbar$). This is a straightforward but
tedious calculation, and we omit the details. ∎

We conclude this section with an example of a quantum subspace that is 
invariant under the (pre)quantized Hamiltonian of a harmonic oscillator. 


Proposition 22.14 Consider a harmonic oscillator with Hamiltonian

$$H = \frac{1}{2m} \bigl( p^2 + (m\omega x)^2 \bigr).$$

Consider also the subspace $\mathcal{H}_\alpha$ in Proposition 22.13, with $\alpha = 1/(m\omega)$.
Then the operator $Q_{\mathrm{pre}}(H)$ leaves $\mathcal{H}_\alpha$ invariant. Furthermore, the
restriction of $Q_{\mathrm{pre}}(H)$ to $\mathcal{H}_\alpha$ has non-negative spectrum consisting of eigenvalues
of the form $n\hbar\omega$, where $n$ ranges over the non-negative integers.


Proposition 22.14 gives a much more physically reasonable result for the
spectrum of the quantization of the non-negative function $H$ than we obtained on the
full prequantum Hilbert space, where (Proposition 22.6) the spectrum of
$Q_{\mathrm{pre}}(H)$ is not even bounded below. When we introduce the “half-form
correction” in Sect. 23.7, we will finally be able to obtain the “correct”
spectrum for the quantum harmonic oscillator, consisting of numbers of
the form $(n + 1/2)\hbar\omega$, $n = 0, 1, 2, \ldots$. See Example 23.53.

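Proof. As in the proof of Proposition 22.6, we introduce the variable
$y = m\omega x$. With $\alpha = 1/(m\omega)$, this gives $z = (y - ip)/(m\omega)$. We use the
symplectic potential

$$\theta = \frac{1}{2} (p \, dx - x \, dp).$$

Then

$$\theta\Bigl( \frac{\partial}{\partial \bar{z}} \Bigr) = \frac{1}{2} \Bigl( p + \frac{i x}{\alpha} \Bigr) = \frac{i z}{2\alpha},$$

and so $\nabla_{\partial/\partial \bar{z}} = \partial/\partial \bar{z} + z/(2\alpha\hbar)$. From this, we can easily check that the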


holomorphic subspace consists of functions of the form

$$F(z) e^{-|z|^2/(2\alpha\hbar)} = F(z) \exp\Bigl\{ -\frac{y^2 + p^2}{2 m\omega\hbar} \Bigr\}, \tag{22.16}$$

where $F$ is holomorphic.
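Meanwhile, as in the proof of Proposition 22.6, we have

$$Q_{\mathrm{pre}}(H) = i\hbar\omega \Bigl( y \frac{\partial}{\partial p} - p \frac{\partial}{\partial y} \Bigr),$$

which is just an angular derivative in the $(y, p)$-plane. Since the exponential
factor in (22.16) is rotationally invariant, $Q_{\mathrm{pre}}(H)$ only hits $F$. Meanwhile,

$$\begin{aligned}
\Bigl( y \frac{\partial}{\partial p} - p \frac{\partial}{\partial y} \Bigr) F\Bigl( \frac{y - ip}{m\omega} \Bigr)
  &= y \frac{dF}{dz} \Bigl( \frac{-i}{m\omega} \Bigr) - p \frac{dF}{dz} \frac{1}{m\omega} \\
  &= -i \, \frac{y - ip}{m\omega} \frac{dF}{dz} \\
  &= -i z \frac{dF}{dz}.
\end{aligned}$$

Thus,

$$Q_{\mathrm{pre}}(H) \bigl( F(z) e^{-|z|^2/(2\alpha\hbar)} \bigr) = \Bigl( \hbar\omega z \frac{dF}{dz} \Bigr) e^{-|z|^2/(2\alpha\hbar)},$$

which is again in the holomorphic subspace.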

Finally, as in Proposition 14.15, the functions $z^n$, $n = 0, 1, 2, \ldots$, form
an orthogonal basis for the Hilbert space $\mathcal{H}_\alpha$. Each monomial $z^n$ is an
eigenvector for the operator $z \, d/dz$ with eigenvalue $n$. This establishes the
claim about the spectrum of the restriction to $\mathcal{H}_\alpha$ of $Q_{\mathrm{pre}}(H)$. ∎

The operator $F \mapsto \hbar\omega z \, dF/dz$ is self-adjoint on the holomorphic Hilbert
space, in contrast to the operators in (22.15) in the case of the position
Hilbert space. Indeed, self-adjointness is “automatic” in this case, because
the holomorphic Hilbert space is actually a subspace of the prequantum
Hilbert space, and the restriction of a self-adjoint operator to an invariant
subspace is self-adjoint.


22.6 Exercises


1. Consider the vector field

$$X = a_j(x) \frac{\partial}{\partial x_j}$$

on $\mathbb{R}^N$, where the $a_j$'s are smooth, real-valued functions. Show that
$X$ is skew-self-adjoint on $C_c^\infty(\mathbb{R}^N)$ if and only if the divergence of $X$
(i.e., the quantity $\partial a_j/\partial x_j$) is identically zero.

2. Using the symplectic potential $\theta = p \, dx$, compute $Q_{\mathrm{pre}}(x p^2)$. Show
that $Q_{\mathrm{pre}}(x p^2)$ is not in the algebra of operators generated by $Q_{\mathrm{pre}}(x)$
and $Q_{\mathrm{pre}}(p)$.

Hint: Consider how $Q_{\mathrm{pre}}(x p^2)$ acts on functions that are independent
of $p$.

3. (a) Suppose $F$ is a holomorphic function on $\mathbb{C}$ such that

$$\int_{\mathbb{C}} |F(z)|^2 \, dz < \infty,$$

where here $dz$ denotes the 2-dimensional Lebesgue measure on
$\mathbb{C} \cong \mathbb{R}^2$. Show that $F$ is identically zero.

Hint: If $F$ is not identically zero, use a power series argument
to show that the $L^2$ norm of $F$ over a disk of radius $R$ tends to
infinity as $R$ tends to infinity.

(b) Show that if a function of the form (22.13), with $F$ holomorphic
on $\mathbb{C}^n$, is square integrable, then $F$ must be identically zero.

4. Prove Proposition 22.11, using the explicit form of $Q_{\mathrm{pre}}(x_j)$ and
$Q_{\mathrm{pre}}(p_j)$ in Example 22.4.

Hint: In the case of the holomorphic subspace, express the operators
$\partial/\partial x_j$ and $\partial/\partial p_j$ in terms of the operators $\partial/\partial z_j$ and $\partial/\partial \bar{z}_j$ in (22.9).

5. Show that the space of functions of the form in (22.11), where $F$ is
holomorphic on $\mathbb{C}^n$, is invariant under the operators $e^{i t Q_{\mathrm{pre}}(x_j)}$ and
$e^{i t Q_{\mathrm{pre}}(p_j)}$ computed in (22.6), for all $t \in \mathbb{R}$ and $j = 1, 2, \ldots, n$.


Geometric Quantization on Manifolds


23.1 Introduction 


Geometric quantization is a type of quantization, which is a general term 
for a procedure that associates a quantum system with a given classical 
system. In practical terms, if one is trying to deduce what sort of quantum 
system should model a given physical phenomenon, one often begins by 
observing the classical limit of the system. Electromagnetic radiation, for 
example, is describable on a macroscopic scale by Maxwell’s equations. On 
a finer scale, quantum effects (photons) become important. How should one 
determine the correct quantum theory of electromagnetism? It seems that 
the only reasonable way to proceed is to “quantize” Maxwell’s equations— 
and then to compare the resulting quantum system to experiment. 

Meanwhile, not every physically interesting system has $\mathbb{R}^{2n}$ as its phase
space. Geometric quantization, then, is an attempt to construct a quantum
Hilbert space, together with appropriate operators, starting from a physical
system having an arbitrary $2n$-dimensional symplectic manifold $N$ as
its phase space. To perform geometric quantization on $N$, one must first
choose a polarization, that is, roughly, a choice of $n$ directions on $N$ in which
the wave functions will be constant. If $N = T^*M$, then one may use the
“vertical polarization,” in which the wave functions are constant along the
fibers of $T^*M$. For cotangent bundles with the vertical polarization,
geometric quantization reproduces the “half-density quantization” of Blattner
[4]. (See Examples 23.45 and 23.48.) Even for cotangent bundles, however,
it is of interest to use polarizations other than the vertical polarization, as
we have seen already in the $\mathbb{R}^{2n}$ case. In the case of the cotangent bundle of
a compact Lie group, for example, the paper [20] shows how quantization
with a complex polarization gives rise to a generalized Segal–Bargmann
transform.

Some phase spaces, meanwhile, may not even be in the form of a cotan- 
gent bundle. In the orbit method in representation theory, for example, 
the relevant symplectic manifolds are “coadjoint orbits,” which typically 
are not cotangent bundles. [In the SU(2) case, for instance, these orbits are 
2-spheres with the natural rotationally invariant symplectic form.] In quan- 
tum field theory, meanwhile, one encounters Lagrangians that are linear, 
rather than quadratic, in the “velocity” variables. In such cases, the initial 
velocity is determined by the initial position, and one cannot think of the 
space of initial conditions as a (co)tangent bundle. Systems of this form can 
still be symplectic, but they are not cotangent bundles. Furthermore, it is
common to think of compact symplectic manifolds (such as $S^2$ with a
rotationally invariant symplectic form) as classical models of internal degrees
of freedom, such as spin.

To quantize these more general symplectic manifolds, one needs a more
general approach to quantization. Given a symplectic manifold $(N, \omega)$
satisfying a certain integrality condition, one can construct a line bundle $L$
over $N$ along with a connection $\nabla$ on $L$ which has a curvature of $\omega/\hbar$.
One can then define “prequantum” operators, acting on sections of $L$, in
much the same way we did in the Euclidean case in Chap. 22, and these
operators will have the desired relationship between Poisson brackets and
commutators. One then chooses a polarization on $N$ and defines the quantum
Hilbert space to be the space of sections that are covariantly constant
in the directions of that polarization. If the Hamiltonian flow generated by
a function $f$ preserves the relevant polarization, then $Q_{\mathrm{pre}}(f)$ will preserve
the quantum Hilbert space. In the case of real polarizations, there may fail
to be any nonzero square-integrable sections that are covariantly constant
in the directions of the polarization, a possibility that forces us to introduce
the machinery of “half-forms.”

Let us end this introduction with a brief critique of the framework of geo- 
metric quantization. In the first place, geometric quantization has too many 
definitions (bundles, connections, curvature, polarizations, half-forms) and 
too few theorems. In the second place, the class of functions that geometric 
quantization allows us to quantize—those functions for which the associ- 
ated Hamiltonian flow preserves the polarization—is often dishearteningly 
small. In the case N = T*M, for example, with the natural “vertical” 
polarization, geometric quantization does not allow us to quantize the ki- 
netic energy function, at least not by the “standard procedure” of geomet- 
ric quantization. Nevertheless, geometric quantization is the only game in 
town if one wants to quantize general symplectic manifolds in a way that 
produces an actual Hilbert space and operators thereon.

This chapter lays out in an orderly fashion all the ingredients needed
to “do” geometric quantization. Furthermore, although this approach in- 
creases length, the chapter fills in the details of several arguments that 
are only sketched in the standard reference on the subject, the book [45] of 
Woodhouse. The presentation assumes basic results about symplectic man- 
ifolds from Chap. 21. Besides the basic results about manifolds reviewed in 
Sect. 21.1, we will make use of the Frobenius theorem (see, e.g., Chap. 19 
of [29]). 

As we have noted already in the introduction to Chap. 22, sign con- 
ventions in the subject of geometric quantization are not consistent from 
author to author. 


23.2 Line Bundles and Connections 


In this section, we develop the necessary machinery to extend the
prequantization construction of Sect. 22.2 to arbitrary symplectic manifolds. We
introduce the notion of a line bundle over a manifold and sections thereof,
which look locally like complex-valued functions. We then introduce the
notion of covariant derivatives of sections of a line bundle, where locally
these covariant derivatives take the form $\nabla_X = X - i\theta(X)$ for a certain
1-form $\theta$. We then introduce the curvature 2-form, which is a globally
defined, closed 2-form that can be computed locally as $d\theta$. We continue to
observe the summation convention, in which repeated indices are always
summed on.


Definition 23.1 If $X$ is a smooth manifold, a complex line bundle over
$X$ is a smooth manifold $L$ together with the following additional structures.
First, we have a smooth, surjective map $\pi : L \to X$. Second, for each $x \in X$,
the set $\pi^{-1}(\{x\})$ is equipped with the structure of a complex vector space of
dimension 1. For each $x \in X$, the vector space $\pi^{-1}(\{x\})$ is called the fiber
of $L$ over $x$.

These structures are assumed to satisfy the local triviality property,
namely that each $x \in X$ has a neighborhood $U$ such that there exists a
diffeomorphism $\chi : \pi^{-1}(U) \to U \times \mathbb{C}$ with the following properties. First,

$$\pi(p) = \pi_1(\chi(p)),$$

where $\pi_1 : U \times \mathbb{C} \to U$ is projection onto the first factor. Second, for each
$x \in U$, the map $p \mapsto \pi_2(\chi(p))$ (with $\pi_2$ the projection onto the second
factor) is a vector space isomorphism of $\pi^{-1}(\{x\})$ with $\mathbb{C}$.

A section of a line bundle $L$ over $X$ is a map $s : X \to L$ such that
$\pi(s(p)) = p$ for all $p \in X$.


For any manifold $X$, we can form the trivial line bundle $X \times \mathbb{C}$, where
$\pi(x, z) = x$ and where the vector space structure on $\{x\} \times \mathbb{C}$ is just the
usual vector space structure on $\mathbb{C}$. The local triviality property for a general
line bundle $L$ means that $L$ “looks” locally like the trivial line bundle.


Definition 23.2 A connection $\nabla$ on a line bundle $L$ over $N$ is a map
associating to each vector field $X$ on $N$ and section $s$ of $L$ another
section $\nabla_X(s)$ of $L$ satisfying the following properties. First, for each smooth
function $f$ on $N$, we have

$$\nabla_{fX}(s) = f \nabla_X(s) \tag{23.1}$$

for all vector fields $X$ and sections $s$. Second, for each smooth function $f$
on $N$, we have the product rule

$$\nabla_X(f s) = (X(f)) s + f \nabla_X(s) \tag{23.2}$$

for all vector fields $X$ and sections $s$.


Note that for any section $s$ of $L$ and any function $f$ on $N$, the quantity
$f s$ is a section of $L$. Given a connection $\nabla$ and a vector field $X$, the operator
$\nabla_X$ is called the covariant derivative in the direction of $X$.


Definition 23.3 A Hermitian structure on a line bundle $L$ over $N$ is
a choice of an inner product $(\cdot, \cdot)$ on each fiber $\pi^{-1}(\{x\})$ of $L$ such that
for each smooth section $s$ of $L$, $(s, s)$ is a smooth function on $N$. A line
bundle $L$ together with a choice of a Hermitian structure on $L$ will be called
a Hermitian line bundle. A connection $\nabla$ on a Hermitian line bundle
$L$ is called Hermitian if for every vector field $X$ on $N$, we have

$$(\nabla_X(s_1), s_2) + (s_1, \nabla_X(s_2)) = X(s_1, s_2) \tag{23.3}$$

for all smooth sections $s_1$ and $s_2$ of $L$.


We will let the expression “Hermitian line bundle with connection” refer 
to a Hermitian line bundle L together with a Hermitian connection on L; 
that is, in this expression, “Hermitian” applies both to the bundle and to 
the connection. 

Given a Hermitian line bundle $L$ with connection, it is always possible
to choose a locally defined smooth section $s_0$ near any point such that
$(s_0, s_0) = 1$. We call $s_0$ a local isometric trivialization of $L$. Any section
$s$ of $L$ can be written locally as $s = f s_0$ for a unique complex-valued
function $f$. Given a vector field $X$, let $\theta(X)$ be the unique function such
that

$$\nabla_X(s_0) = -i\theta(X) s_0.$$

Using the assumption $\nabla_{fX} = f \nabla_X$, it can be shown (Exercise 1) that the
value of $\theta(X)$ at a point $p$ depends only on the value of $X$ at $p$. Thus, $\theta$
defines a 1-form on $N$. Using the assumption that $\nabla$ is Hermitian, it can
be shown (Exercise 2) that $\theta(X)$ is always real valued.

Now, using the product rule (23.2) for covariant derivatives, we have

$$\nabla_X(f s_0) = X(f) s_0 + f \nabla_X(s_0) = (X(f) - i\theta(X) f) s_0.$$

Thus, if we identify sections of $L$ locally with the coefficient function $f$, we
have

$$\nabla_X(f) = X(f) - i\theta(X) f, \tag{23.4}$$

as in Sect. 22.2. We call $\theta$ the connection 1-form associated to the particular
local isometric trivialization.


Definition 23.4 For any Hermitian line bundle $(L, \nabla)$ with connection,
define the curvature 2-form $\omega$ of $\nabla$ by requiring that

$$\omega(X, Y) s = i \bigl( \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X, Y]} \bigr)(s)$$

for all sections $s$ and vector fields $X$ and $Y$.


Of course, one should check that the given expression for $\omega$ is really a
2-form, meaning that the value of $\omega(X, Y)$ at a point $z$ depends only on
the values of $X$ and $Y$ at $z$, and that it does not depend on the choice of
section $s$, provided only that $s(z) \neq 0$. One way to do this is to compute $\omega$
in a local isometric trivialization, as in the following result. (See Exercise 3
for a different approach.)


Proposition 23.5 Let $s_0$ be a local isometric trivialization of $L$ and let $\theta$
be the associated connection 1-form. Then the curvature 2-form $\omega$ of $\nabla$ is
expressed locally as

$$\omega = d\theta.$$

In particular, $\omega$ is a closed 2-form.


Proof. The computation is precisely the same as in the proof of Proposition
22.3 in the Euclidean case. ∎

A locally defined 1-form $\theta$ satisfying $d\theta = \omega$ is called a (local) symplectic
potential for $\omega$. Our next result says that every symplectic potential is the
connection 1-form for some local isometric trivialization of $L$.


Proposition 23.6 Let $(L, \nabla)$ be a Hermitian line bundle with connection
over $N$ with curvature 2-form $\omega$. For each point $z_0 \in N$ and 1-form $\theta$
defined in a neighborhood $U$ of $z_0$ satisfying $d\theta = \omega$, there is a
subneighborhood $V \subset U$ of $z_0$ and a local isometric trivialization of $L$ over $V$ such
that the connection 1-form of the trivialization is $\theta$.


Proof. Let $s_0$ be any isometric trivializing section defined in a
neighborhood of $z_0$ and let $\eta$ be the associated connection 1-form. Since $d(\eta - \theta) = 0$,
there is a subneighborhood $V \subset U$ of $z_0$ on which $\eta - \theta = df$, for some
smooth function $f$. If $s_1 = e^{if} s_0$, then

$$\begin{aligned}
\nabla_X(s_1) &= i X(f) e^{if} s_0 + e^{if} \nabla_X(s_0) \\
  &= i X(f) e^{if} s_0 - i \eta(X) e^{if} s_0 \\
  &= -i (\eta(X) - df(X)) s_1.
\end{aligned}$$

Thus, the connection 1-form associated with the local isometric
trivialization $s_1$ is $\eta - df = \theta$. ∎


Proposition 23.7 If $(L_1, \nabla^1)$ and $(L_2, \nabla^2)$ are Hermitian line bundles
with connection over $N$, let $L_1 \otimes L_2$ denote the line bundle over $N$ for
which the fiber over $x$ is $L_{1,x} \otimes L_{2,x}$, with the natural inner product induced
by the inner products on $L_{1,x}$ and $L_{2,x}$. Then there is a unique Hermitian
connection $\nabla$ on $L_1 \otimes L_2$ with the property that

$$\nabla_X(s_1 \otimes s_2) = (\nabla^1_X s_1) \otimes s_2 + s_1 \otimes (\nabla^2_X s_2),$$

for all vector fields $X$ on $N$ and all smooth sections $s_1$ of $L_1$ and $s_2$ of $L_2$.
The curvature 2-form $\omega$ for $(L_1 \otimes L_2, \nabla)$ is given by

$$\omega = \omega_1 + \omega_2,$$

where $\omega_1$ and $\omega_2$ are the curvature 2-forms for $(L_1, \nabla^1)$ and $(L_2, \nabla^2)$,
respectively.


The proof of this proposition is a straightforward exercise in “definition 
chasing” and is left as an exercise to the reader. 

Suppose that $L$ is a Hermitian line bundle over $N$ with connection $\nabla$
and curvature 2-form $\omega$. Given a loop $\gamma : [a, b] \to N$, we can construct a
section $s$ of $L$ that is defined over $\gamma$ such that the covariant derivative of $s$
in the directions along $\gamma$ is zero. Indeed, in a local isometric trivialization,
such a section can be constructed as

$$s(\gamma(\tau)) = \exp\Bigl\{ i \int_a^\tau \theta(\dot{\gamma}(t)) \, dt \Bigr\}. \tag{23.5}$$

The value of $s$ at the endpoint of the loop will in general not agree with the
value at the starting point, but will differ by multiplication by a constant
of absolute value 1.


Definition 23.8 The holonomy of a loop $\gamma : [a, b] \to N$ is the unique
constant $\alpha$ (of absolute value 1) such that $s(\gamma(b)) = \alpha s(\gamma(a))$, where $s$ is a
nonzero section defined over $\gamma$ that is covariantly constant in the directions
of $\gamma$.

The value of the holonomy of $\gamma$ is easily seen to be independent of the
value of $s$ at the starting point, provided this starting value is nonzero.

Suppose that $S$ is a compact, oriented surface with boundary in $N$ whose
boundary $\partial S$ is a loop. It is not hard to show that the holonomy around
$\partial S$ can be computed as

$$\mathrm{holonomy}(\partial S) = \exp\Bigl\{ i \int_S \omega \Bigr\}. \tag{23.6}$$

Indeed, if $S$ is contained in the domain of a local isometric trivialization,
then this result follows from (23.5) by means of Stokes's theorem
(Sect. 21.1.2).
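
As a numerical illustration of (23.5) and (23.6) in the simplest setting, the following sketch works on the trivial bundle over the plane with the connection 1-form $\theta = (B/2)(x \, dy - y \, dx)$, so that $d\theta = B \, dx \wedge dy$. The constant `B`, the circular loop, and all function names are hypothetical choices made for this example only; they do not come from the text.

```python
import cmath
import math


def loop_integral_theta(B, R, steps=20000):
    """Midpoint-rule approximation of the line integral of
    theta = (B/2)(x dy - y dx) around the counterclockwise circle of radius R.
    (theta, B and the loop are illustrative assumptions.)"""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = R * math.cos(t), R * math.sin(t)
        # Components of the velocity of the loop, multiplied by dt.
        dx, dy = -R * math.sin(t) * dt, R * math.cos(t) * dt
        total += 0.5 * B * (x * dy - y * dx)
    return total


def holonomy_from_loop(B, R):
    """Holonomy computed as in (23.5): exp(i * integral of theta along the loop)."""
    return cmath.exp(1j * loop_integral_theta(B, R))


def holonomy_from_surface(B, R):
    """Holonomy computed as in (23.6): exp(i * integral of omega over the disk),
    where omega = d(theta) = B dx ^ dy and the disk has area pi * R**2."""
    return cmath.exp(1j * B * math.pi * R * R)


if __name__ == "__main__":
    print(holonomy_from_loop(0.7, 1.3))
    print(holonomy_from_surface(0.7, 1.3))
```

The two functions agree up to discretization and rounding error, which is the content of Stokes's theorem in this special case. The loop is traversed counterclockwise so that it matches the standard orientation of the disk.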

Now, if $S$ is a closed (i.e., boundaryless) surface, its boundary is the
trivial loop, which has a holonomy that is trivial, that is, equal to 1. (Think
of approximating $S$ by a surface for which the boundary is a very small
loop.) Thus, for any closed surface $S$, (23.6) gives

$$\exp\Bigl\{ i \int_S \omega \Bigr\} = 1, \qquad \partial S = \emptyset. \tag{23.7}$$

Equivalently, we have

$$\frac{1}{2\pi} \int_S \omega \in \mathbb{Z}. \tag{23.8}$$

The condition (23.8) says that $\omega/(2\pi)$ is an integral 2-form. Clearly, not
every closed 2-form satisfies this property.

The closedness of $\omega$ (Proposition 23.5) and the condition (23.8) represent
necessary conditions that the curvature of a Hermitian connection must
satisfy. It turns out that these two conditions are also sufficient.


Theorem 23.9 Suppose $\omega$ is a closed 2-form on a manifold $N$ for which
$\omega/(2\pi)$ is integral in the sense of (23.8). Then there exists a Hermitian
line bundle $L$ over $N$ with Hermitian connection $\nabla$ such that the curvature
of $\nabla$ is equal to $\omega$. If, in addition, $N$ is simply connected, then $(L, \nabla)$ is
unique up to equivalence.


See Sect. 8.3 of [45] for a proof of this result. An equivalence of two
Hermitian line bundles $L_1$ and $L_2$ with Hermitian connection over $N$ is a
diffeomorphism $\Phi : L_1 \to L_2$ such that for each $x \in N$, the restriction of
$\Phi$ to $\pi_1^{-1}(\{x\})$ is an isometric linear map onto $\pi_2^{-1}(\{x\})$ and such that for
each section $s$ of $L_1$, we have

$$\Phi(\nabla_X(s)) = \nabla_X(\Phi(s)).$$

We now have the necessary tools to proceed with the program of
geometric quantization on symplectic manifolds.

The first step in the program of geometric quantization for a symplectic
manifold $(N, \omega)$ is to construct a Hermitian line bundle $L$ over $N$ with
Hermitian connection for which the curvature 2-form is equal to $\omega/\hbar$.
Theorem 23.9 gives the condition for the existence of such a bundle.


Definition 23.10 A symplectic manifold $(N, \omega)$ is quantizable (for a
particular value of $\hbar$) if

$$\frac{1}{2\pi\hbar} \int_S \omega \in \mathbb{Z}$$

for every closed surface $S$ in $N$.


Note that if $(N, \omega)$ is quantizable for a given value $\hbar_0$ of Planck’s
constant, then $(N, \omega)$ is also quantizable for $\hbar = \hbar_0/k$ for every positive integer
$k$. Indeed, according to Proposition 23.7, if $L$ is a Hermitian line bundle
with connection having curvature $\omega/\hbar_0$, then $L^{\otimes k}$ (the tensor product of
$L$ with itself $k$ times) is a Hermitian line bundle with connection having
curvature $\omega/(\hbar_0/k)$.
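
To make the integrality condition concrete, here is a small sketch for the unit 2-sphere with $\omega$ equal to a constant `s` times the rotationally invariant area form. Since the total area is $4\pi$, the quantity $(2\pi\hbar)^{-1} \int_{S^2} \omega$ reduces to $2s/\hbar$. The parameter `s`, the function names, and the choice to test only the integral over the whole sphere are assumptions made for illustration. Exact rational arithmetic is used so that the integrality test is not affected by rounding.

```python
from fractions import Fraction


def quantization_integer(s, hbar):
    """Return (1/(2*pi*hbar)) * (integral of omega over the unit sphere),
    for omega = s * (area form). The area is 4*pi, so the factors of pi
    cancel and the value is 2*s/hbar. (s is a hypothetical scale factor.)"""
    return Fraction(2) * Fraction(s) / Fraction(hbar)


def is_quantizable(s, hbar):
    """True when the quantity above is an integer."""
    return quantization_integer(s, hbar).denominator == 1


if __name__ == "__main__":
    hbar0 = Fraction(1)
    s = Fraction(1, 2)
    # If the condition holds for hbar0, it also holds for hbar0/k.
    for k in range(1, 5):
        print(k, quantization_integer(s, hbar0 / k), is_quantizable(s, hbar0 / k))
```

Replacing $\hbar_0$ by $\hbar_0/k$ multiplies the integer by $k$, which mirrors the passage from $L$ to $L^{\otimes k}$ described above.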

For the remainder of this chapter, we will assume that $N$ is a quantizable
symplectic manifold with symplectic form $\omega$ and that $(L, \nabla)$ is a fixed
Hermitian line bundle with connection over $N$ with curvature $\omega/\hbar$.

If $L$ is a Hermitian line bundle over a symplectic manifold $N$, we say
that a measurable section $s$ of $L$ is square integrable if

$$\|s\| = \Bigl( \int_N (s(x), s(x)) \, \lambda(x) \Bigr)^{1/2}$$

is finite, where $\lambda$ is the Liouville volume form on $N$. Given two
square-integrable sections $s_1$ and $s_2$ of $L$, we define the inner product of $s_1$ and
$s_2$ by

$$\langle s_1, s_2 \rangle = \int_N (s_1(x), s_2(x)) \, \lambda(x). \tag{23.9}$$

We use parentheses to denote the pointwise inner product $(s_1(x), s_2(x))$
of two sections $s_1$ and $s_2$, which is a function on $N$, and we use angled
brackets to denote the global inner product $\langle s_1, s_2 \rangle$ of the sections, which
is a number.


Definition 23.11 The prequantum Hilbert space for N is the space of 
equivalence classes of square-integrable sections of L, where two sections are 
equivalent if they are equal almost everywhere with respect to the Liouville 
volume measure. 


Definition 23.12 If $f$ is a smooth complex-valued function on $N$, the
prequantum operator $Q_{\mathrm{pre}}(f)$ is the unbounded operator on the prequantum
Hilbert space given by

$$Q_{\mathrm{pre}}(f) = i\hbar \nabla_{X_f} + f,$$

where $f$ represents the operation of multiplying a section by $f$.


Proposition 23.13 If $f$ is real-valued, then $Q_{\mathrm{pre}}(f)$ is symmetric on the
space of smooth compactly supported sections of $L$.


Proof. Let $s_1$ and $s_2$ be smooth, compactly supported sections of $L$ and let
$\Phi^f_t$ denote the Hamiltonian flow generated by $f$. For all sufficiently small
$t$, every point in the supports of $s_1$ and $s_2$ will be contained in the domain of
$\Phi^f_t$. Furthermore, by Liouville’s theorem, the value of

$$\int_N (s_1, s_2) \circ \Phi^f_t \, \lambda$$

is independent of $t$. If we differentiate this relation with respect to $t$ and
evaluate at $t = 0$, we obtain, by (23.3),

$$0 = \int_N \bigl[ (\nabla_{X_f}(s_1), s_2) + (s_1, \nabla_{X_f}(s_2)) \bigr] \, \lambda.$$

Thus, $\nabla_{X_f}$ is a skew-symmetric operator on the space of smooth, compactly
supported sections, from which it follows that $Q_{\mathrm{pre}}(f)$ is symmetric. ∎

By the product rule for covariant derivatives and the identity $X_f(f) =
\{f, f\} = 0$, we see that the two terms in the definition of $Q_{\mathrm{pre}}(f)$ commute.
We would then expect the exponential $e^{i t Q_{\mathrm{pre}}(f)}$ to decompose as a product
of two exponentials. One of these exponentials is just $e^{i t f}$ and the other
may be constructed as “parallel transport along the flow generated by $X_f$.”
Thus, if the flow generated by $X_f$ is complete, it is possible to use Stone’s
theorem to construct $Q_{\mathrm{pre}}(f)$ as a self-adjoint operator on a domain that
includes the space of smooth compactly supported sections.


Proposition 23.14 For any $f, g \in C^\infty(N)$, we have

$$\frac{1}{i\hbar} [Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] = Q_{\mathrm{pre}}(\{f, g\}),$$

where the equality holds as operators on the space of smooth sections of $L$.


Proof. The argument is precisely the same as in Proposition 22.1 in the
$\mathbb{R}^{2n}$ case. ∎
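
As a sanity check of this relation in the simplest case, take $N = \mathbb{R}^2$ with $f = x$ and $g = p$. The sketch below assumes the operators $Q_{\mathrm{pre}}(x) = x + i\hbar \, \partial/\partial p$ and $Q_{\mathrm{pre}}(p) = -i\hbar \, \partial/\partial x$, which are consistent with the formulas of Proposition 22.11 but are not restated in this section, together with the normalization $\{x, p\} = 1$.

```latex
% Sketch only. Assumptions: Q_pre(x) = x + i\hbar \partial_p,
% Q_pre(p) = -i\hbar \partial_x, and \{x, p\} = 1.
\begin{align*}
Q_{\mathrm{pre}}(x) Q_{\mathrm{pre}}(p) \psi
  &= -i\hbar \, x \frac{\partial \psi}{\partial x}
     + \hbar^2 \frac{\partial^2 \psi}{\partial p \, \partial x}, \\
Q_{\mathrm{pre}}(p) Q_{\mathrm{pre}}(x) \psi
  &= -i\hbar \, \psi - i\hbar \, x \frac{\partial \psi}{\partial x}
     + \hbar^2 \frac{\partial^2 \psi}{\partial x \, \partial p}, \\
\frac{1}{i\hbar} [Q_{\mathrm{pre}}(x), Q_{\mathrm{pre}}(p)] \psi
  &= \psi = Q_{\mathrm{pre}}(1) \psi = Q_{\mathrm{pre}}(\{x, p\}) \psi.
\end{align*}
```

Here $Q_{\mathrm{pre}}(1)$ is the identity because the Hamiltonian vector field of a constant function vanishes.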

As we have seen already in Sect. 22.3 in the $\mathbb{R}^{2n}$ case, the prequantum
Hilbert space is “too large” to be considered the quantization of $N$.

In the $\mathbb{R}^{2n}$ case, we have the position, momentum, and holomorphic
subspaces (Definition 22.7), consisting of functions that depend only on $x$, $p$,
or $z$, in the sense that the covariant derivatives of functions in the
directions of $p$, $x$, and $\bar{z}$ are zero. In each case, the “basic observables” of the
particular representation (the $x_j$’s, the $p_j$’s, and the $z_j$’s, respectively) act
simply as multiplication operators.

To generalize this to a symplectic manifold $N$ of dimension $2n$, we may think of choosing $n$ functions $\alpha_1, \ldots, \alpha_n$ on $N$ that are “independent,” in the sense that $d\alpha_1, \ldots, d\alpha_n$ are linearly independent at each point. We assume that the functions $\alpha_j$ Poisson commute ($\{\alpha_j, \alpha_k\} = 0$), which makes it reasonable to hope that the quantizations of the $\alpha_j$’s could act as (commuting) multiplication operators. For each $z \in N$, we let $P_z$ be the $n$-dimensional space of directions in which the $\alpha_j$’s are constant, that is, the intersection of the kernels of $d\alpha_1, \ldots, d\alpha_n$. Since we wish to allow the functions $\alpha_j$ to be complex valued, $P_z$ should be thought of as a subspace of the complexified tangent space $T_z^{\mathbb{C}}(N)$. The idea is that our quantum Hilbert space should consist of sections of a prequantum line bundle that are covariantly constant in the directions of $P$.

Now, at each point $z$, the Hamiltonian vector field $X_{\alpha_k}$ will belong to $P_z$, because

\[ d\alpha_j(X_{\alpha_k}) = X_{\alpha_k}(\alpha_j) = \{\alpha_k, \alpha_j\} = 0. \]

Furthermore, since the $d\alpha_j$’s are linearly independent, the $X_{\alpha_j}$’s are also independent, since $X_{\alpha_j}$ is obtained from $d\alpha_j$ by an isomorphism of tangent and cotangent spaces. Thus, the $X_{\alpha_j}$’s must actually span $P_z$ at each point, by a dimension count. Since also $\omega(X_{\alpha_j}, X_{\alpha_k}) = -\{\alpha_j, \alpha_k\} = 0$, we conclude that $\omega$ is identically zero on $P_z$. Furthermore, if $X$ and $Y$ are vector fields lying in $P$ at each point, we can express them (summing over repeated indices) as

\[ X = a_j(z) X_{\alpha_j}, \quad Y = b_j(z) X_{\alpha_j}, \]

for some smooth functions $a_j$ and $b_j$. Then

\[ [X, Y] = a_j(z) X_{\alpha_j}(b_k) X_{\alpha_k} - b_k(z) X_{\alpha_k}(a_j) X_{\alpha_j}, \]

because $[X_{\alpha_j}, X_{\alpha_k}] = X_{\{\alpha_j, \alpha_k\}} = 0$. Thus, the commutator of two vector fields lying in $P$ will again lie in $P$.
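
As a concrete illustration, here is a sketch for $N = \mathbb{R}^{2n}$, taking the commuting functions to be the position coordinates. It assumes the conventions $\omega = dp_j \wedge dx_j$ and $\omega(X_f, \cdot) = df$, which give the coordinate expression $X_f = \frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j}$; the choice of example is illustrative.

```latex
\begin{align*}
\alpha_j &= x_j, \qquad \{x_j, x_k\} = 0, \\
X_{x_j} &= \frac{\partial}{\partial p_j}, \\
P_z &= \bigcap_{j=1}^{n} \ker(dx_j)
     = \operatorname{span}\left\{ \frac{\partial}{\partial p_1}, \ldots, \frac{\partial}{\partial p_n} \right\}, \\
\omega\!\left( \frac{\partial}{\partial p_j}, \frac{\partial}{\partial p_k} \right) &= 0 .
\end{align*}
```

In this case the Hamiltonian vector fields visibly span the common kernel of the $dx_j$, and the resulting $P$ consists of the momentum directions at every point.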


Definition 23.15 For any $z \in N$, a subspace $P$ of $T_z^{\mathbb{C}}N$ is said to be Lagrangian if $\dim P = n$ and $\omega(X, Y) = 0$ for all $X, Y \in P$.


Definition 23.16 A polarization of a symplectic manifold $N$ is a choice at each point $z \in N$ of a Lagrangian subspace $P_z \subset T_z^{\mathbb{C}}(N)$, satisfying the following two conditions.

1. If two complex vector fields $X$ and $Y$ lie in $P_z$ at each point $z$, then so does $[X, Y]$.


2. The dimension of $P_z \cap \bar P_z$ is constant.


The first condition is called integrability, and we have motivated this 
condition in the discussion preceding the definition. The second condition 
is a technical one that prevents problems with certain constructions, such 
as the pairing map. (Although, in practice, one sometimes needs to work 
with “polarizations” in which the second condition is violated, extra care 
is needed in such cases.) 

There is one small inaccuracy in our discussion of polarizations: For purely conventional reasons, the quantum Hilbert space is defined as the space of sections that are covariantly constant in the direction of $\bar P$, rather than $P$. Thus, $P$ should really be the complex conjugate of the space of directions in which the sections are constant. This convention, however, makes no difference to the definition of a polarization, since if $P$ satisfies the conditions of Definition 23.16, so does $\bar P$.


Example 23.17 If $M$ is any smooth manifold, let $N = T^*M$ be the cotangent bundle of $M$, equipped with the canonical 2-form $\omega$ (Example 21.2). For each $z \in T^*M$, let $P_z$ be the complexification of the tangent space to the fiber $T_x^*M$ containing $z$. Then $P$ is a polarization on $T^*M$, called the vertical polarization.


Proof. If $\{x_j\}$ is any local coordinate system on $M$, let $\{x_j, p_j\}$ be the associated local coordinate system on $T^*M$. The canonical 2-form is given by $\omega = dp_j \wedge dx_j$. At each point $z \in T^*M$, the vertical subspace $P_z$ is spanned by the vectors $\partial/\partial p_j$. Since $\omega(\partial/\partial p_j, \partial/\partial p_k) = 0$, we see that $P_z$ is Lagrangian. Furthermore, $\bar P_z = P_z$ at every point, and so $\dim P_z \cap \bar P_z$ has the constant value $n = \dim M$. Finally, the integrability of $P$ follows by computing the commutator of two vector fields of the form $f_j(x, p)\, \partial/\partial p_j$, which will again be a linear combination of the $\partial/\partial p_j$’s. Integrability also follows from the easy direction of the Frobenius theorem, since the fibers of $T^*M$ are integral submanifolds for $P$. ∎

We may identify two special classes of polarizations, those that are purely real (i.e., $\bar P_z = P_z$ for all $z \in N$) and those that are purely complex (i.e., $P_z \cap \bar P_z = \{0\}$ for all $z \in N$). The vertical polarization, for example, is purely real.

If $P$ is purely real, the integrability of $P$ implies, by the Frobenius theorem, that every point in $N$ is contained in a unique submanifold $R$ that is maximal in the class of connected integral submanifolds for $P$. [An integral submanifold $R$ for $P$ is a submanifold for which $T_z^{\mathbb{C}}(R) = P_z$ for all $z \in R$.] We will refer to the maximal connected, integral submanifolds of a purely real polarization as the leaves of the polarization.

In general, the leaves may not be embedded submanifolds of $N$. Suppose, for example, that $N = S^1 \times S^1$, with $\omega = d\theta \wedge d\phi$, where $\theta$ and $\phi$ are angular coordinates on the two copies of $S^1$. Then the tangent space to $N$ at any point may be identified with $\mathbb{R}^2$ by means of the basis $\{\partial/\partial\theta, \partial/\partial\phi\}$. We may define a polarization $P$ on $N$ by defining $P_z$ to be the span of the vector

\[ \frac{\partial}{\partial\theta} + a\,\frac{\partial}{\partial\phi}, \]

for some fixed irrational number $a$. Each leaf of $P$ is then a set of the form

\[ \left\{ \left( e^{i\theta_0} e^{it},\, e^{iat} \right) \in S^1 \times S^1 \,\middle|\, t \in \mathbb{R} \right\}, \]

for some $\theta_0$, which is an “irrational line” in $S^1 \times S^1$. Each leaf is then dense in $S^1 \times S^1$ and, thus, not embedded. We will need to avoid such pathological examples if we hope to successfully carry out the program of geometric quantization with respect to a real polarization. Much more information about the structure of real polarizations may be found in Sects. 4.5-4.7 of [45].
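
The density of an irrational leaf can be explored numerically. The following is a minimal sketch, not part of the original argument: it follows the leaf through $(\theta_0, \phi = 0)$, records the $\phi$-values at which it re-crosses the circle $\theta = \theta_0$, and measures how close those crossings come to a chosen target angle. The function names, the slope $\sqrt{2}$, and the target angle $1.0$ are all illustrative assumptions; in floating point the slope is of course only an approximation to an irrational number.

```python
import math

TWO_PI = 2.0 * math.pi


def leaf_returns(a, n_turns):
    """phi-values (mod 2*pi) where the leaf through (theta0, phi=0) re-crosses theta = theta0.

    Along the leaf, theta = theta0 + t and phi = a*t, so theta returns to
    theta0 exactly when t = 2*pi*k for an integer k.
    """
    return [(a * TWO_PI * k) % TWO_PI for k in range(n_turns)]


def circle_distance(u, v):
    """Distance between two angles measured along the circle."""
    d = abs(u - v) % TWO_PI
    return min(d, TWO_PI - d)


def closest_approach(a, target_phi, n_turns):
    """How close the first n_turns crossings come to the angle target_phi."""
    return min(circle_distance(phi, target_phi) for phi in leaf_returns(a, n_turns))


if __name__ == "__main__":
    # Slope sqrt(2) stands in for an irrational a; slope 1/2 is a rational comparison.
    for n in (10, 100, 1000, 10000):
        print(n, closest_approach(math.sqrt(2.0), 1.0, n), closest_approach(0.5, 1.0, n))
```

For the slope $\sqrt{2}$ the closest approach shrinks as more turns are included, while for the rational slope $1/2$ the crossings only ever hit two angles and the distance to the target stays fixed, which is the closed-leaf behavior one wants for geometric quantization.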

We now consider some elementary results concerning purely complex 
polarizations. 


Proposition 23.18 Suppose $P$ is a purely complex polarization on $N$. For each $z \in N$, let $J_z : T_z^{\mathbb{C}}N \to T_z^{\mathbb{C}}N$ be the unique linear map such that $J_z = iI$ on $P_z$ and $J_z = -iI$ on $\bar P_z$. Then $J_z$ is real (i.e., it maps the real tangent space to itself) and $\omega$ is $J_z$-invariant [i.e., $\omega(J_z X_1, J_z X_2) = \omega(X_1, X_2)$ for all $X_1, X_2 \in T_z^{\mathbb{C}}N$].


Proof. Since the restriction of $J_z$ to $\bar P_z$ is the complex-conjugate of its restriction to $P_z$, the map $J_z$ commutes with complex conjugation and thus maps real vectors (those satisfying $\bar X = X$) to real vectors. Meanwhile, since $P_z$ is Lagrangian and $\omega$ is real, $\bar P_z$ is also Lagrangian. Given two vectors $X_1 = Y_1 + Z_1$ and $X_2 = Y_2 + Z_2$, with $Y_j \in P_z$ and $Z_j \in \bar P_z$, we compute that

\[
\begin{aligned}
\omega(J_z X_1, J_z X_2)
&= \omega(iY_1, iY_2) + \omega(iY_1, -iZ_2) + \omega(-iZ_1, iY_2) + \omega(-iZ_1, -iZ_2) \\
&= \omega(Y_1, Z_2) + \omega(Z_1, Y_2).
\end{aligned}
\]

A similar calculation gives the same value for $\omega(X_1, X_2)$, showing that $\omega$ is $J_z$-invariant. ∎

A complex structure on a $2n$-dimensional manifold $N$ is a collection of “holomorphic” coordinate systems that cover $N$ and such that the transition maps between coordinate systems are holomorphic as maps between open sets in $\mathbb{R}^{2n} \cong \mathbb{C}^n$. At each point $z \in N$, there is a linear map $J_z : T_zN \to T_zN$ defined by the expression

\[ J_z\!\left( \frac{\partial}{\partial x_j} \right) = \frac{\partial}{\partial y_j}, \qquad J_z\!\left( \frac{\partial}{\partial y_j} \right) = -\frac{\partial}{\partial x_j}, \]

where the $x_j$’s and $y_j$’s are the real and imaginary parts of holomorphic coordinates. This map is independent of the choice of holomorphic coordinates and satisfies $J_z^2 = -I$. At each point $z \in N$, the complexified tangent space $T_z^{\mathbb{C}}N$ can be decomposed into eigenspaces for $J_z$ with eigenvalues $i$ and $-i$; these are called the (1,0)- and (0,1)-tangent spaces, respectively.

Meanwhile, if $N$ is any $2n$-dimensional manifold and $J$ is a smoothly varying family of linear maps on each tangent space satisfying $J_z^2 = -I$ for all $z$, then $J$ is called an almost-complex structure. Given an almost-complex structure, we can divide the complexified tangent space into $\pm i$ eigenspaces for $J$. The Newlander–Nirenberg theorem asserts that if the family of $+i$ eigenspaces is integrable (in the sense of Point 1 of Definition 23.16), then there exists a unique complex structure on $N$ for which these are the (1,0)-tangent spaces.

A purely complex polarization $P$ gives rise to a complex structure on $N$, as follows. By Proposition 23.18 and the Newlander–Nirenberg theorem, there is a unique complex structure on $N$ for which $P_z$ is the (1,0)-tangent space, for all $z \in N$.

Now, we have already seen in the $\mathbb{R}^{2n}$ case that some purely complex polarizations behave better than others. [Compare (22.11) to (22.13).] The geometric condition that characterizes the “good” polarizations is the following.





Definition 23.19 For any purely complex polarization $P$, let $J$ be the unique almost-complex structure on $N$ such that $J_z = iI$ on $P_z$ and $J_z = -iI$ on $\bar P_z$. We say that $P$ is a Kähler polarization if the bilinear form

\[ g(X, Y) := \omega(X, JY) \tag{23.10} \]

is positive definite for each $z \in N$.


For any purely complex polarization, the bilinear form $g$ in (23.10) is symmetric, as the reader may easily verify using the $J_z$-invariance of $\omega$.

Suppose, for example, that we identify $\mathbb{R}^2$ with $\mathbb{C}$ by the map $z = x - i\alpha p$, for some fixed $\alpha > 0$. If we define a purely complex polarization on $\mathbb{R}^2$ by taking $P_z$ to be the span of the vector $\partial/\partial z$ in (22.9), then (Exercise 4) $P$ is a Kähler polarization.
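
The positivity claim for $z = x - i\alpha p$ can be checked numerically. The sketch below is an illustration under stated assumptions rather than a solution to the exercise: it takes $\omega = dp \wedge dx$ on $\mathbb{R}^2$, assumes that $\partial/\partial z$ is the usual Wirtinger derivative, so that it is proportional to $\partial/\partial x + (i/\alpha)\,\partial/\partial p$, builds $J$ from the requirement $J = +i$ on $P$ and $J = -i$ on $\bar P$, and then tabulates $g(X, Y) = \omega(X, JY)$. All function names are hypothetical.

```python
def omega(X, Y):
    """omega = dp ^ dx on R^2, with vectors stored as (x-component, p-component)."""
    return X[1] * Y[0] - X[0] * Y[1]


def complex_structure(v):
    """Real 2x2 matrix of J, where J = +i on span(v) and J = -i on span(conj(v))."""
    v = (complex(v[0]), complex(v[1]))
    w = (v[0].conjugate(), v[1].conjugate())
    det = v[0] * w[1] - v[1] * w[0]
    if abs(det) < 1e-12:
        raise ValueError("v and its conjugate must be linearly independent")
    columns = []
    for e in ((1.0, 0.0), (0.0, 1.0)):
        # Write e = a*v + b*w (Cramer's rule), then J e = i*a*v - i*b*w.
        a = (e[0] * w[1] - e[1] * w[0]) / det
        b = (v[0] * e[1] - v[1] * e[0]) / det
        Je = (1j * (a * v[0] - b * w[0]), 1j * (a * v[1] - b * w[1]))
        # J maps real vectors to real vectors, so the imaginary parts vanish.
        columns.append((Je[0].real, Je[1].real))
    return [[columns[0][0], columns[1][0]], [columns[0][1], columns[1][1]]]


def metric(v):
    """Matrix of g(X, Y) = omega(X, J Y) in the basis (d/dx, d/dp)."""
    J = complex_structure(v)
    basis = ((1.0, 0.0), (0.0, 1.0))

    def apply_J(Y):
        return (J[0][0] * Y[0] + J[0][1] * Y[1], J[1][0] * Y[0] + J[1][1] * Y[1])

    return [[omega(X, apply_J(Y)) for Y in basis] for X in basis]


def is_positive_definite(g):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return g[0][0] > 0 and g[0][0] * g[1][1] - g[0][1] * g[1][0] > 0


if __name__ == "__main__":
    alpha = 2.0
    print(metric((1, 1j / alpha)))   # polarization spanned by d/dz
    print(metric((1, -1j / alpha)))  # conjugate polarization, for comparison
```

With these conventions the polarization spanned by $\partial/\partial z$ gives $g = \mathrm{diag}(1/\alpha, \alpha)$, while the conjugate polarization gives the negative of this matrix and so fails the Kähler condition.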


23.5 Quantization Without Half-Forms 


To construct a prequantum Hilbert space, we must choose a line bundle $(L, \nabla)$ over $(N, \omega)$ having curvature $\omega/\hbar$. Such a bundle exists if $\omega/\hbar$ is an integral 2-form and is unique (up to equivalence) if $N$ is simply connected. To pass to the quantum Hilbert space, we must make a substantial additional choice, that of a polarization $P$ on $N$. In our first attempt at defining the quantum Hilbert space associated with $P$, we consider the space of sections of $(L, \nabla)$ that are covariantly constant in the directions of $\bar P$. Although this approach works reasonably well for a purely complex polarization, in the case of a purely real polarization, there typically are no square-integrable sections satisfying this condition. (Indeed, we have seen this problem already in the $\mathbb{R}^{2n}$ case, in Sect. 22.4.) In the next section, we will introduce half-forms to address this problem.

In the remainder of the chapter, we will let $P$ denote a fixed polarization on $N$.


23.5.1 The General Case 


As we have remarked, it is customary to consider sections that are covariantly constant in the directions of $\bar P$ rather than in the directions of $P$.


Definition 23.20 A smooth section $s$ of $L$ is polarized (with respect to $P$) if

\[ \nabla_X s = 0 \tag{23.11} \]

for every vector field $X$ lying in $\bar P$. The quantum Hilbert space associated with $P$ is the closure in the prequantum Hilbert space of the space of smooth, square-integrable, polarized sections of $L$.


As in the Euclidean case, we will simply restrict the prequantum opera- 
tors to the quantum Hilbert space, in those cases where Qpre(f) preserves 
the space of polarized sections. 


Definition 23.21 A smooth, complex-valued function f on N is quanti- 
zable with respect to P if Qpre(f) preserves the space of smooth sections 
that are polarized with respect to P. 


The following definition will provide a natural geometric condition guar- 
anteeing quantizability of a function. 


Definition 23.22 A possibly complex vector field X preserves a polar- 
ization P if for every vector field Y lying in P, the vector field [X,Y] also 
lies in P. 


Note that if X lies in P, then X preserves P, by the integrability assump- 
tion on P. There will typically be, however, many vector fields that do not 
lie in P but nevertheless preserve P. 

If $X$ is a real vector field, then $[X, Y]$ is the same as the Lie derivative $\mathcal{L}_X(Y)$. It is then not hard to show that $X$ preserves $P$ if and only if the flow generated by $X$ preserves $P$, that is, if and only if $(\Phi_t)_*(P_z) = P_{\Phi_t(z)}$ for all $z$ and $t$, where $\Phi$ is the flow of $X$. Furthermore, if $X$ is real, then $X$ preserves $P$ if and only if $X$ preserves $\bar P$.


Example 23.23 If $N = T^*M$ for some manifold $M$ and $P$ is the vertical polarization on $N$, then a Hamiltonian vector field $X_f$ preserves $P$ if and only if $f = f_1 + f_2$, where $f_1$ is constant on each fiber and $f_2$ is linear on each fiber.


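Proof. In local coordinates $\{x_j, p_j\}$, a vector field $X$ lying in $P$ has the form $X = g_j\, \partial/\partial p_j$. Thus,

\[ [X_f, X] = \left[ -\frac{\partial f}{\partial p_j}\frac{\partial}{\partial x_j} + \frac{\partial f}{\partial x_j}\frac{\partial}{\partial p_j},\; g_k \frac{\partial}{\partial p_k} \right]. \]

This commutator will consist of three “good” terms, which involve only $p$-derivatives, along with the following “bad” term:

\[ g_k\, \frac{\partial^2 f}{\partial p_k\, \partial p_j}\, \frac{\partial}{\partial x_j}. \]

If $\partial^2 f/\partial p_k \partial p_j$ is 0 for all $j$ and $k$, then the bad term vanishes and $[X_f, X]$ again lies in $P$. Conversely, if we want the bad term to vanish for each choice of the coefficient functions $g_j$, we must have $\partial^2 f/\partial p_k \partial p_j = 0$ for all $j$ and $k$. Thus, for each fixed value of $x$, $f$ must contain only terms that are independent of $p$ and terms that are linear in $p$. ∎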
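
To see the criterion at work, consider a sketch on $T^*\mathbb{R}$ (so $n = 1$), assuming the coordinate expression $X_f = \frac{\partial f}{\partial x}\frac{\partial}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial}{\partial x}$, which follows from $\omega = dp \wedge dx$ and $\omega(X_f, \cdot) = df$. The two functions below are illustrative choices, and $g$ is an arbitrary smooth coefficient function.

```latex
\begin{align*}
f = \tfrac{1}{2}p^2: \quad & X_f = -p\,\frac{\partial}{\partial x}, \qquad
\left[ X_f,\; g\,\frac{\partial}{\partial p} \right]
 = -p\,\frac{\partial g}{\partial x}\,\frac{\partial}{\partial p} + g\,\frac{\partial}{\partial x}, \\[4pt]
f = x\,p: \quad & X_f = p\,\frac{\partial}{\partial p} - x\,\frac{\partial}{\partial x}, \qquad
\left[ X_f,\; g\,\frac{\partial}{\partial p} \right]
 = \left( p\,\frac{\partial g}{\partial p} - x\,\frac{\partial g}{\partial x} - g \right)\frac{\partial}{\partial p}.
\end{align*}
```

For the quadratic function the commutator acquires a $\partial/\partial x$ component whenever $g \neq 0$, so $X_f$ does not preserve the vertical polarization. For the function that is linear in $p$, the commutator stays vertical.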

We now identify the condition for quantizability of functions. 


Theorem 23.24 For any smooth, complex-valued function $f$ on $N$, if the Hamiltonian vector field $X_f$ preserves $\bar P$, then $f$ is quantizable.


Since we do not assume that $f$ is real-valued, the condition that $X_f$ preserve $\bar P$ is not equivalent to the condition that $X_f$ preserve $P$.

Proof. Given a polarized section $s$, we apply $Q_{\mathrm{pre}}(f)$ to $s$ and then test whether $Q_{\mathrm{pre}}(f)s$ is still polarized, by applying $\nabla_X$ for some vector field $X$ lying in $\bar P$. To this end, it is useful to compute the commutator of $\nabla_X$ and $Q_{\mathrm{pre}}(f)$, as follows:

\[
\begin{aligned}
[\nabla_X, Q_{\mathrm{pre}}(f)] &= i\hbar\,[\nabla_X, \nabla_{X_f}] + [\nabla_X, f] \\
&= i\hbar \left( \nabla_{[X, X_f]} - \frac{i}{\hbar}\,\omega(X, X_f) \right) + X(f) \\
&= i\hbar\, \nabla_{[X, X_f]},
\end{aligned}
\tag{23.12}
\]

where we have used that

\[ \omega(X, X_f) = -\omega(X_f, X) = -df(X) = -X(f), \]

by Definition 21.6. Since $X_f$ preserves $\bar P$, the vector field $[X, X_f]$ again lies in $\bar P$ and, thus,

\[ \nabla_X \left( Q_{\mathrm{pre}}(f) s \right) = Q_{\mathrm{pre}}(f) \nabla_X s + i\hbar\, \nabla_{[X, X_f]} s = 0, \]

for every polarized section $s$, showing that $Q_{\mathrm{pre}}(f)s$ is again polarized. ∎

The converse of Theorem 23.24 is false in general. After all, as we will see 
in the following subsections, for a given polarization, there may not be any 
nonzero globally defined polarized sections, in which case, any function is 
quantizable. On the other hand, it can be shown that if Qpre(f) preserves 
the space of locally defined polarized sections, then the Hamiltonian flow 
generated by f must preserve P. This result follows by the same reasoning 
as in the proof of Theorem 23.24, once we know that there are sufficiently 
many locally defined polarized sections. We will establish such an existence 
result for purely real and purely complex polarizations in the following 
subsections; for the general case, see the discussion following Definition 
9.1.1 in [45]. 

A special case of Theorem 23.24 is provided by “polarized functions,” that is, functions $f$ for which $X(f) = 0$ for all vector fields $X$ lying in $\bar P$. For such an $f$, the action of $Q_{\mathrm{pre}}(f)$ on the quantum space is simply multiplication by $f$, as we anticipated in the introductory discussion in Sect. 23.4.


Proposition 23.25 If $f$ is a smooth, complex-valued function on $N$ and the derivatives of $f$ in the $\bar P$ directions are zero, then $Q_{\mathrm{pre}}(f)$ preserves the space of $P$-polarized sections, and the restriction of $Q_{\mathrm{pre}}(f)$ to this space is simply multiplication by $f$.


We have already seen special cases of this result in the $\mathbb{R}^{2n}$ case; see the discussion following Proposition 22.11.

Proof. If the derivatives of $f$ in the direction of $\bar P$ are zero, then for $X \in \bar P$, we have

\[ 0 = X(f) = df(X) = \omega(X_f, X), \]

meaning that $X_f$ is in the $\omega$-orthogonal complement of $\bar P$. But since $\bar P$ is Lagrangian, this complement is just $\bar P$. Thus, $X_f$ belongs to $\bar P$ and, in particular, $X_f$ preserves $\bar P$, so that $f$ is quantizable, by Theorem 23.24. Furthermore, $\nabla_{X_f} s = 0$ for any $P$-polarized section $s$, leaving only the $fs$ term in the formula for $Q_{\mathrm{pre}}(f)s$. ∎


23.5.2 The Real Case 


In the $\mathbb{R}^{2n}$ case, we have already computed the space of polarized sections for the vertical polarization in Proposition 22.8. As we observed there, there are no nonzero polarized sections that are square integrable over $\mathbb{R}^{2n}$. The same difficulty is easily seen to arise for the vertical polarization on any cotangent bundle $N = T^*M$. In Sect. 23.6, we will introduce half-forms to deal with this failure of square integrability.

We now examine properties of general real polarizations. We will see that 
polarized sections always exist locally, but not always globally. 
Proposition 23.26 If $P$ is a purely real polarization on $N$, then for any $z_0 \in N$, there exist a neighborhood $U$ of $z_0$ and a $P$-polarized section $s$ of $L$ defined over $U$ such that $s(z_0) \neq 0$.


Proof. According to the local form of the Frobenius theorem, we can find a neighborhood $U$ of $z_0$ and a diffeomorphism $\Phi$ of $U$ with a neighborhood $V$ of the origin in $\mathbb{R}^n \times \mathbb{R}^n$ such that under $\Phi$, the polarization $P$ looks like the vertical polarization. That is to say, for each $z \in U$, the image of $P_z$ under $\Phi_*(z)$ is just the span of the vectors $\partial/\partial y_1, \ldots, \partial/\partial y_n$, where the $y$’s are the coordinates on the second copy of $\mathbb{R}^n$. By shrinking $U$ if necessary, we can assume that $L$ can be trivialized over $U$ and that the open set $V$ is the product of a ball $B_1$ centered at the origin in the first copy of $\mathbb{R}^n$ with a ball $B_2$ centered at the origin in the second copy of $\mathbb{R}^n$.

Let $\theta$ be the connection 1-form for an isometric trivialization of $L$ over $U$ and let $\tilde\theta = (\Phi^{-1})^*(\theta)$. Since the subspaces $P_z$ are Lagrangian, the restriction of $\tilde\theta$ to each set of the form $\{x\} \times B_2$ is closed. Since $B_2$ is simply connected, there exists, for each $x \in B_1$, a function $f_x$ on $B_2$ such that the restriction of $\tilde\theta$ to $\{x\} \times B_2$ equals $df_x$. If we assume that $f_x(0) = 0$, then $f_x(y)$ will be smooth as a function of $(x, y)$, since it is obtained simply by integrating $\tilde\theta$ from 0 to $y$ in the vertical directions.

Now, let $\phi$ be any smooth function on $B_1$ with $\phi(0) \neq 0$ and define a function $\tilde\psi$ on $B_1 \times B_2$ by

\[ \tilde\psi(x, y) = \phi(x)\, e^{i f_x(y)/\hbar}. \]

For any “vertical” vector field $X$ (i.e., one where $X$ is a linear combination of $\partial/\partial y_1, \ldots, \partial/\partial y_n$ with smooth coefficients), we compute that

\[ X\tilde\psi = \frac{i}{\hbar}\,(X f_x)\,\tilde\psi = \frac{i}{\hbar}\, df_x(X)\, \tilde\psi = \frac{i}{\hbar}\, \tilde\theta(X)\, \tilde\psi. \]

Thus,

\[ \left( X - \frac{i}{\hbar}\, \tilde\theta(X) \right) \tilde\psi = 0, \]

from which it follows that the function $\psi := \tilde\psi \circ \Phi$ represents a polarized section on $U$ in the given local trivialization of $L$. ∎

The existence of nonzero global polarized sections for a purely real polarization $P$ is a more delicate question. If the leaves of $P$ are not embedded, there is little chance of finding global polarized sections. Even if the leaves are embedded, there are obstructions. Since the tangent spaces to the leaves of $P$ are Lagrangian subspaces, the restriction of $L$ to any leaf $R$ has zero curvature. There may, nevertheless, be loops in $R$ for which the holonomy (Definition 23.8) is nontrivial. After all, if a loop $\gamma$ in $R$ is not the boundary of a surface $S$ in $R$, then we cannot apply (23.6) to conclude that the holonomy of $\gamma$ is trivial. The collection of holonomies for a leaf $R$ of $P$ can be understood as a homomorphism of $\pi_1(R)$ into $S^1$. If there is any loop in $R$ with nontrivial holonomy, any polarized section of $L$ must vanish on $R$.


Definition 23.27 A submanifold $R$ of $N$ is said to be Lagrangian if $\dim R = n$ and $T_zR$ is a Lagrangian subspace of $T_zN$ for each $z \in R$. A Lagrangian submanifold $R$ of $N$ is said to be Bohr–Sommerfeld (with respect to $L$) if the holonomy in $L$ of every loop in $R$ is trivial.


We may summarize the preceding discussion as follows. 


Conclusion 23.28 For a purely real polarization $P$ with embedded leaves, a polarized section vanishes on every leaf of $P$ that is not Bohr–Sommerfeld.


Our next example suggests that when the leaves are compact, the Bohr–Sommerfeld leaves typically form a discrete set within the set of all leaves.


Example 23.29 Let $N = S^1 \times \mathbb{R}$, equipped with the symplectic form $\omega = dx \wedge d\phi$, where $x$ is the linear coordinate on $\mathbb{R}$ and $\phi$ is the angular coordinate on $S^1$. Let $L$ be the trivial line bundle on $N$, with sections that are identified with smooth functions. Let $\theta = x\, d\phi$ and define a connection $\nabla$ on $L$ by $\nabla_X = X - (i/\hbar)\theta(X)$, and let $P$ be the purely real polarization of $N$ for which the leaves are the sets of the form $S^1 \times \{x\}$, for $x \in \mathbb{R}$. Then a leaf $S^1 \times \{x\}$ is Bohr–Sommerfeld if and only if $x/\hbar$ is an integer. In particular, there are no nonzero, smooth polarized sections of $L$.


Proof. If we define a section locally on a given leaf $S^1 \times \{x\}$ as

\[ s(\phi) = c\, e^{i x \phi/\hbar} \]

for some nonzero constant $c$, then it is easily verified that $\nabla_{\partial/\partial\phi}\, s = 0$. After one trip around the circle, the value of this section will be the starting value times $e^{2\pi i x/\hbar}$. Thus, the holonomy around $S^1 \times \{x\}$ is trivial if and only if $x/\hbar$ is an integer. A polarized section, then, would have to vanish on all the leaves where $x/\hbar$ is not an integer. Since such leaves form a dense subset of $N$, any smooth polarized section must be identically zero. ∎
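
The Bohr–Sommerfeld condition in this example is easy to tabulate. The following minimal sketch uses the holonomy factor $e^{2\pi i x/\hbar}$ from the proof; the function names, the tolerance, and the sample value of $\hbar$ are illustrative assumptions.

```python
import cmath
import math


def holonomy(x, hbar):
    """Holonomy of the connection X - (i/hbar) * x dphi around the leaf S^1 x {x}."""
    return cmath.exp(2j * math.pi * x / hbar)


def is_bohr_sommerfeld(x, hbar, tol=1e-9):
    """A leaf is Bohr-Sommerfeld when its holonomy equals 1 (up to rounding)."""
    return abs(holonomy(x, hbar) - 1.0) < tol


if __name__ == "__main__":
    hbar = 0.5  # sample value, chosen only for illustration
    for x in (-1.0, -0.5, 0.0, 0.2, 0.5, 0.75, 1.0):
        print(x, holonomy(x, hbar), is_bohr_sommerfeld(x, hbar))
```

Only the leaves with $x$ an integer multiple of $\hbar$ pass the test, which matches the discreteness of the Bohr–Sommerfeld leaves asserted in the example.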

Even in cases, such as Example 23.29, where there are no smooth po- 
larized sections, one may still consider “distributional” polarized sections 
that are supported on the Bohr-Sommerfeld leaves, as on pp. 251-252 of 
[45]. 


23.5.3 The Complex Case


In Proposition 22.8, we computed the space of polarized sections for a certain positive, translation-invariant polarization on $\mathbb{R}^{2n}$, namely the one for which $P_z$ is spanned by the vectors $\partial/\partial z_j$ in (22.9). The situation here is better than that for the vertical polarization, in that there are nonzero polarized sections that are square integrable over $\mathbb{R}^{2n}$. Recall, however, that if we take our polarization to be spanned by the vectors $\partial/\partial \bar z_j$, then [see (22.13)] there are no nonzero square-integrable polarized sections. This example indicates the importance of the positivity condition in Definition 23.19.

For our next example, we consider the unit disk $D$, equipped with the unique (up to a constant) symplectic form that is invariant under the group of fractional linear transformations that map $D$ onto $D$. In this case, the quantum Hilbert space can be identified with a weighted Bergman space, that is, an $L^2$ space of holomorphic functions on $D$ with respect to a measure of the form $(1 - |z|^2)^{\nu}\, dx\, dy$.


Example 23.30 Let $N$ be the unit disk $D \subset \mathbb{R}^2$ equipped with the following symplectic form:

\[ \omega = 4(1 - |z|^2)^{-2}\, dx \wedge dy = 4(1 - r^2)^{-2}\, r\, dr \wedge d\phi, \]

where $(r, \phi)$ are the usual polar coordinates. Let $L$ be the trivial line bundle over $D$ with connection $\nabla_X = X - (i/\hbar)\theta(X)$, where $\theta$ is the symplectic potential for $\omega$ given by


Define a complex polarization on $D$ by letting $P_z = \operatorname{Span}(\partial/\partial z)$, where $z = x - iy$. In that case, holomorphic sections $s$ have the form

\[ s(z) = F(z)\,(1 - |z|^2)^{1/\hbar}, \]

where $F$ is holomorphic. The norm of such a section is computed as

\[ \|s\|^2 = 4 \int_D |F(z)|^2\, (1 - |z|^2)^{2/\hbar - 2}\, dx\, dy. \]

As in the case of the plane, the seemingly unnatural definition $z = x - iy$ is necessary to obtain a Kähler polarization. If we used $z = x + iy$ instead, the holomorphic sections would have the form $F(z)(1 - |z|^2)^{-1/\hbar}$, in which case there would be no nonzero, square-integrable holomorphic sections.

Proof. See Exercise 8. ∎
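
The norm formula can be made concrete for monomials $F(z) = z^m$. Passing to polar coordinates and substituting $u = r^2$ turns the integral into a Beta function, $\|s\|^2 = 4\pi \int_0^1 u^m (1-u)^{2/\hbar - 2}\, du = 4\pi\, B(m+1,\, 2/\hbar - 1)$, which is finite exactly when $\hbar < 2$. The sketch below evaluates this closed form and compares it with a simple midpoint-rule quadrature. The function names and the sample values of $\hbar$ are illustrative assumptions, not part of the original example.

```python
import math


def monomial_norm_sq(m, hbar):
    """Closed form 4*pi*B(m+1, 2/hbar - 1) for the squared norm of z^m (1-|z|^2)^(1/hbar)."""
    c = 2.0 / hbar - 1.0
    if c <= 0.0:
        return math.inf  # the weight (1-u)^(2/hbar - 2) is not integrable at u = 1
    return 4.0 * math.pi * math.gamma(m + 1) * math.gamma(c) / math.gamma(m + 1 + c)


def monomial_norm_sq_numeric(m, hbar, steps=20000):
    """Midpoint rule for 4*pi * integral_0^1 u^m (1-u)^(2/hbar - 2) du."""
    exponent = 2.0 / hbar - 2.0
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += (u ** m) * ((1.0 - u) ** exponent)
    return 4.0 * math.pi * total * h


if __name__ == "__main__":
    for m in range(4):
        print(m, monomial_norm_sq(m, 0.5), monomial_norm_sq_numeric(m, 0.5))
    print(monomial_norm_sq(0, 2.0))  # boundary case: the norm is infinite
```

The finite values for small $\hbar$ show that every monomial, and hence every polynomial $F$, gives a square-integrable section, while the divergence at $\hbar \geq 2$ shows that the Hilbert space is trivial unless Planck's constant is small enough relative to the normalization of $\omega$.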

We now consider general purely complex polarizations. Recall that, by Proposition 23.18 and the Newlander–Nirenberg theorem, $N$ has a unique complex structure for which $P_z$ is the (1,0)-subspace of $T_z^{\mathbb{C}}N$, for all $z \in N$. As in the purely real case, there always exist local polarized sections.


Theorem 23.31 Suppose $P$ is a purely complex polarization on $N$. Then for each $z_0 \in N$, there exists a $P$-polarized section $s$ of $L$, defined in a neighborhood of $z_0$, such that $s(z_0) \neq 0$.


We defer the proof of Theorem 23.31 until the end of this subsection.

Suppose $s$ is as in the theorem and $s'$ is any other locally defined $P$-polarized section. Then $s' = fs$ for some unique complex-valued function $f$, and by the product rule for covariant derivatives, $X(f) = 0$ for all $X \in \bar P_z$. This means that $f$ is holomorphic with respect to the complex structure on $N$ for which $P$ is the (1,0)-tangent space. Thus, we have a preferred family of local trivializations of $L$ (the ones given by nonvanishing local polarized sections) such that the “ratio” of any two such trivializations is a holomorphic function. This means that we have given $L$ the structure of a “holomorphic line bundle” over the complex manifold $N$ in such a way that the holomorphic sections of $L$ are precisely the polarized sections with respect to $P$.

Arguing as in the proof of Proposition 14.15, it is not hard to show that for a purely complex polarization, the space of square-integrable polarized sections of $L$ forms a closed subspace of the prequantum Hilbert space. For any $z \in N$, if we choose a linear identification of the fiber of $L$ over $z$ with $\mathbb{C}$, then the map $s \mapsto s(z)$ is a linear functional on the quantum Hilbert space. It is not hard to show, as in the proof of Proposition 14.15, that this linear functional is continuous, and can therefore be represented as an inner product with a unique element of the quantum Hilbert space.


Definition 23.32 Let $P$ be a purely complex polarization on $N$. For each $z \in N$, choose a linear identification of the fiber of $L$ over $z$ with $\mathbb{C}$. Then the coherent state $\chi_z$ is the unique element of the quantum Hilbert space with respect to $P$ such that

\[ s(z) = \langle \chi_z, s \rangle \]

for all $s$.


Suppose $N = \mathbb{R}^2$ with a polarization given by $P_z = \operatorname{Span}(\partial/\partial z)$, where $z = x - i\alpha p$. If we use the symplectic potential $\theta = (p\, dx - x\, dp)/2$, then, as in the proof of Proposition 22.14, the quantum Hilbert space is naturally identifiable with the Segal–Bargmann space. In this case, the coherent states can be read off from Proposition 14.17.

It could happen that $\chi_z = 0$ for some $z \in N$, or even for all $z \in N$, depending on the choice of $P$. Even if $\chi_z$ is nonzero, $\chi_z$ is only well defined up to multiplication by a constant, because we must choose an identification of the fiber of $L$ over $z$ with $\mathbb{C}$. But if $\chi_z \neq 0$, the one-dimensional subspace spanned by $\chi_z$ is independent of this choice. That is to say, whenever $\chi_z \neq 0$, the span of $\chi_z$ is a well-defined element of the projective space $\mathbb{P}(\mathbf{H})$, where $\mathbf{H}$ is the quantum Hilbert space.

Recall, meanwhile, that if $(L, \nabla)$ is a Hermitian line bundle with connection having curvature $\omega/\hbar$, then for any positive integer $k$, there is a natural Hermitian connection on $L^{\otimes k}$ having curvature $k\omega/\hbar$. This means that if $L$ is a prequantum line bundle with one value $\hbar_0$ of Planck’s constant, then $L^{\otimes k}$ is a prequantum line bundle with Planck’s constant equal to $\hbar_0/k$. The following result shows that in the case of compact symplectic manifolds with Kähler polarizations, things behave nicely when $k$ tends to infinity.


Theorem 23.33 Assume $N$ is compact and let $P$ be a Kähler polarization on $N$. For each positive integer $k$, let $\mathbf{H}_k$ denote the space of polarized sections of $L^{\otimes k}$. Then for all $k$, $\mathbf{H}_k$ is finite dimensional. Furthermore, for all sufficiently large $k$, we have the following results. First, the coherent state $\chi_z \in \mathbf{H}_k$ is nonzero for each $z \in N$. Second, the map

\[ z \mapsto \operatorname{Span}(\chi_z) \]

is an antiholomorphic embedding of $N$ into $\mathbb{P}(\mathbf{H}_k)$.


The finite dimensionality of $\mathbf{H}_k$ is a standard result in the theory of compact, complex manifolds. The embedding of $N$ into $\mathbb{P}(\mathbf{H}_k)$ is the content of the Kodaira embedding theorem, which we will not prove here. The Kodaira embedding theorem implies, in particular, that there exist nonzero, globally defined polarized sections of $L^{\otimes k}$, at least for large $k$. Since the value of Planck’s constant for $L^{\otimes k}$ is $\hbar_0/k$, Planck’s constant tends to zero as $k$ tends to infinity. Thus, the study of holomorphic sections of $L^{\otimes k}$ for large $k$ can be understood as being part of semiclassical analysis.

We now turn to the proof of Theorem 23.31, in which we will make use of basic properties of complex-valued differential forms on complex manifolds. (“Complex-valued” means that we allow the value of a $k$-form on a collection of $k$ tangent vectors to be a complex number.) In a holomorphic local coordinate system $z_1, \ldots, z_n$, each form can be written as a linear combination of wedge products of the $dz_j$’s and $d\bar z_j$’s. A form is called a $(p, q)$-form if it is a linear combination of wedge products of $p$ factors involving the $dz_j$’s and $q$ factors involving the $d\bar z_j$’s. Each form can be decomposed uniquely as a linear combination of $(p, q)$-forms for various values of $p$ and $q$, and this decomposition does not depend on the choice of holomorphic coordinate system. If $\alpha$ is a $(p, q)$-form, then $d\alpha$ will be a linear combination of a $(p+1, q)$-form and a $(p, q+1)$-form. We define operators $\partial$ and $\bar\partial$ in such a way that $\partial$ maps $(p, q)$-forms to $(p+1, q)$-forms, $\bar\partial$ maps $(p, q)$-forms to $(p, q+1)$-forms, and $d = \partial + \bar\partial$. In particular,

\[
\partial\left( f\, dz_{j_1} \wedge \cdots \wedge dz_{j_p} \wedge d\bar z_{k_1} \wedge \cdots \wedge d\bar z_{k_q} \right)
= \sum_{l=1}^{n} \frac{\partial f}{\partial z_l}\, dz_l \wedge dz_{j_1} \wedge \cdots \wedge dz_{j_p} \wedge d\bar z_{k_1} \wedge \cdots \wedge d\bar z_{k_q},
\]

and similarly for $\bar\partial$ with $(\partial f/\partial z_l)\, dz_l$ replaced by $(\partial f/\partial \bar z_l)\, d\bar z_l$. The maps $\partial$ and $\bar\partial$ satisfy the identities:

\[ \partial^2 = \bar\partial^2 = 0, \qquad \partial\bar\partial = -\bar\partial\partial. \]

The Dolbeault lemma states that if a $(p, q)$-form $\alpha$ satisfies $\partial\alpha = 0$, then $\alpha$ can be expressed locally as $\partial\beta$ for some $(p-1, q)$-form $\beta$, and if $\bar\partial\alpha = 0$, then $\alpha$ can be expressed locally as $\bar\partial\beta$ for some $(p, q-1)$-form $\beta$. A $(p, 0)$-form $\alpha$ is said to be holomorphic if it can be expressed in holomorphic coordinates as a sum of terms of the form

\[ f(z)\, dz_{j_1} \wedge \cdots \wedge dz_{j_p}, \]

where the coefficient function $f$ is holomorphic. A $(p, 0)$-form $\alpha$ is holomorphic if and only if $\bar\partial\alpha = 0$. If a holomorphic $(p, 0)$-form $\alpha$ satisfies $\partial\alpha = 0$ (or, equivalently, $d\alpha = 0$), then $\alpha$ can be written locally as $\alpha = d\beta$, for some holomorphic $(p-1, 0)$-form $\beta$.

Let $P$ be a purely complex polarization on $N$ and let $J$ be the almost-complex structure for which $P_z$ is the (1,0)-tangent space at $z$. Since (Proposition 23.18) $\omega$ is $J$-invariant, it follows (Exercise 6) that $\omega$ is a (1,1)-form.


Lemma 23.34 Let $N$ be a complex manifold with almost-complex structure $J$ and let $\omega$ be a closed, $J$-invariant, real-valued (1,1)-form on $N$. Then for every point $z_0 \in N$, there exists a smooth, real-valued function $\kappa$ defined in a neighborhood of $z_0$ such that $i\partial\bar\partial\kappa = \omega$.


In the case that $N$ is Kähler [i.e., the case where $\omega(X, JX) > 0$ for all nonzero $X$], a function $\kappa$ as in the lemma is called a (local) Kähler potential for $N$.

Proof. By assumption, $d\omega = (\partial + \bar\partial)\omega = 0$, from which it follows that $\partial\omega = \bar\partial\omega = 0$, because $\partial\omega$ is a (2,1)-form and $\bar\partial\omega$ is a (1,2)-form. Thus, by the Dolbeault lemma, there exists a (1,0)-form $\alpha$, defined in a neighborhood of $z_0$, such that $\bar\partial\alpha = \omega$. Then $\partial\alpha$ is a (2,0)-form that satisfies

\[ \bar\partial\partial\alpha = -\partial\bar\partial\alpha = -\partial\omega = 0. \]

This shows that $\partial\alpha$ is actually a holomorphic (2,0)-form.

Since also $\partial\partial\alpha = 0$, we see that $\partial\alpha$ is closed, which means that there exists a holomorphic 1-form $\eta$, defined in a possibly smaller neighborhood of $z_0$, such that $d\eta = \partial\eta = \partial\alpha$. Thus, $\partial(\alpha - \eta) = 0$, and so by the Dolbeault lemma, there exists a function $g$, defined in a neighborhood of $z_0$, such that $\partial g = \alpha - \eta$. Thus, $\alpha = \eta + \partial g$ and so

\[ \omega = \bar\partial\alpha = \bar\partial\partial g = -\partial\bar\partial g, \]

since $\bar\partial\eta = 0$. The function $\kappa := ig$ then satisfies $i\partial\bar\partial\kappa = \omega$.

Now, a calculation in coordinates (Exercise 7) shows that the map $\kappa \mapsto i\partial\bar\partial\kappa$ is real, that is, it maps real-valued functions to real-valued 2-forms. Since $\omega$ is real, the operator $i\partial\bar\partial$ must map the imaginary part of $\kappa$ to zero. Thus, $i\partial\bar\partial\kappa$ is unchanged if $\kappa$ is replaced by its real part. ∎
Proof of Theorem 23.31. Let $\kappa$ be as in Lemma 23.34 and let $\theta$ be the real-valued 1-form given by

$\theta = \mathrm{Im}(\partial\kappa) = \frac{1}{2i}(\partial\kappa - \bar\partial\kappa).$ (23.13)

Then because $\partial^2 = \bar\partial^2 = 0$, we have

$d\theta = (\partial + \bar\partial)\theta = \frac{1}{2i}(\bar\partial\partial\kappa - \partial\bar\partial\kappa) = \omega.$

That is to say, $\theta$ is a symplectic potential for $\omega$. Thus, by Proposition 23.6, we can find a local isometric trivialization $s_0$ of $L$ for which the connection 1-form is $\theta/\hbar$.

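For any vector $X$, we have

$\nabla_X\left(e^{-\kappa/(2\hbar)} s_0\right) = \left(-\frac{X(\kappa)}{2\hbar} - \frac{i}{\hbar}\theta(X)\right) e^{-\kappa/(2\hbar)} s_0,$ (23.14)

where $X(\kappa) = d\kappa(X) = \partial\kappa(X) + \bar\partial\kappa(X)$. Now, if $X$ is of type $(0,1)$, then $\partial\kappa(X) = 0$, in which case, if we use (23.13), we find that the two terms on the right-hand side of (23.14) cancel. Thus, $e^{-\kappa/(2\hbar)} s_0$ is the desired local polarized section. ∎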


23.6 Quantization with Half-Forms: The Real Case 


In this section, we introduce a concept known as half-forms, which are 
designed to work around the problem that, in the case of real polarizations, 
there often do not exist any nonzero square-integrable polarized sections. 

A polarized section $s$ for a real polarization $P$ tends to have infinite norm, because we may get infinity from integrating $|s|^2$ along the leaves of the polarization. To illustrate how half-forms work around this problem, consider the case of the vertical polarization on $\mathbb{R}^2 \cong T^*\mathbb{R}$. Elements of the half-form Hilbert space will be representable in the form $s \otimes \sqrt{dx}$, where $s$ is a polarized section of $L$ and where $\sqrt{dx}$ will be interpreted as a “section of the square root of the canonical bundle.” To compute the norm of such an object, we first square it at each point to obtain the quantity $|s|^2\,dx$. Since $s$ is polarized, $|s|^2$ is a function of $x$ only, independent of $p$. Thus, $|s|^2\,dx$ may be thought of as a 1-form on $\mathbb{R}$, rather than on $\mathbb{R}^2$, which we may then integrate to obtain


$\|s\|^2 = \int_{\mathbb{R}} |s|^2(x)\,dx.$


This procedure has two advantages over the one we used in Sect. 22.4, where we simply integrated $|s|^2$ itself over $\mathbb{R}^2$. First, a version of this procedure works for real polarizations on general symplectic manifolds. Second,
the half-form approach will allow quantized observables to be self-adjoint, 
which was not the case in Sect. 22.5 when we simply restricted prequan- 
tized observables to the polarized subspace. (See the discussion following 
Proposition 22.12.) 

Throughout this section, we assume that $N$ is a quantizable symplectic manifold, that $L$ is a fixed prequantum line bundle over $N$, and that $P$ is a fixed purely real polarization on $N$.


23.6.1 The Space of Leaves


Recall that a leaf of $P$ is a maximal connected, integral submanifold of $P$. We may then form the leaf space $\Xi$ (the set of all leaves of $P$) and a quotient map $q : N \to \Xi$ sending each point $z \in N$ to the unique leaf containing $z$. We may topologize $\Xi$ by defining a set $U$ in $\Xi$ to be open if $q^{-1}(U)$ is open in $N$.

In order to be able to carry out the program of geometric quantization with respect to $P$, we must assume that $\Xi$ can be given the structure of a smooth, $n$-dimensional manifold in such a way that $q : N \to \Xi$ is smooth and such that the kernel of $q_{*,z}$ is equal to $P_z^{\mathbb{R}}$, the intersection of $P_z$ with the real tangent space $T_zN$. We abbreviate this assumption on $\Xi$ by saying that $\Xi$ is a smooth manifold. In the case $N = T^*M$ with the vertical polarization (Example 23.17), the leaf space $\Xi$ is a smooth manifold diffeomorphic to $M$.

It should be emphasized that even if $\Xi$ is a smooth manifold, there is no canonical “volume measure” on $\Xi$. Thus, our half-form Hilbert space will be defined in such a way that the pointwise “square” of an element will be an $n$-form, rather than a function, on the leaf space, which can then be integrated over the $n$-manifold $\Xi$.


23.6.2 The Canonical Bundle 


We now introduce the canonical bundle of a purely real polarization $P$, with sections that are a special sort of $n$-form on $N$, along with a notion of polarized section of the canonical bundle. If the leaf space $\Xi$ is a smooth manifold, the space of polarized sections of the canonical bundle can be identified with the space of all $n$-forms on the $n$-manifold $\Xi$.


Definition 23.35 The canonical bundle $K_P$ of $P$ is the real line bundle with sections that are $n$-forms $\alpha$ having the property that

$X \lrcorner \alpha = 0$ (23.15)

for every vector field $X$ lying in $P$. A section $\alpha$ of $K_P$ is polarized if

$X \lrcorner (d\alpha) = 0$ (23.16)

for every vector field $X$ lying in $P$.


If an $n$-form $\alpha$ satisfies (23.15), then $\alpha(X_1,\ldots,X_n) = 0$ if any of the $X_j$'s belongs to $P$. Thus, the value of $\alpha$ at any point $z$ can be viewed as an $n$-linear, alternating functional on the quotient vector space $T_zN/P_z^{\mathbb{R}}$, where $P_z^{\mathbb{R}}$ is the intersection of $P_z$ with the real tangent space. Since this quotient space is $n$-dimensional, we see that at each point, the space of possible values for $\alpha$ is one dimensional.

Meanwhile, if $\alpha$ satisfies (23.16), then at each point, $d\alpha$ is an $(n+1)$-linear, alternating functional on $T_zN/P_z^{\mathbb{R}}$, which must be zero. Thus, for sections of $K_P$, (23.16) is equivalent to the condition

$d\alpha = 0.$ (23.17)

We can also introduce the complexified canonical bundle $K_P^{\mathbb{C}}$, the sections of which are complex-valued $n$-forms satisfying (23.15). We define a section of $K_P^{\mathbb{C}}$ to be polarized if it satisfies (23.16).


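Example 23.36 Let $N = T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$ and let $P$ be the vertical polarization on $N$. Then an $n$-form $\alpha$ on $\mathbb{R}^{2n}$ is a section of $K_P$ if and only if $\alpha$ is of the form

$\alpha = f(x,p)\,dx_1 \wedge \cdots \wedge dx_n,$ (23.18)

and $\alpha$ is a polarized section of $K_P$ if and only if $\alpha$ is of the form

$\alpha = g(x)\,dx_1 \wedge \cdots \wedge dx_n,$ (23.19)

for smooth functions $f$ on $\mathbb{R}^{2n}$ and $g$ on $\mathbb{R}^n$.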


Proof. If $\alpha$ contained any term involving $dp_j$, the contraction of $\alpha$ with $\partial/\partial p_j$ would not be zero, leaving (23.18) as the only possible form for a section of $K_P$. Assuming $\alpha$ is of the form (23.18), if $f$ is not independent of $p$, then $d\alpha$ will contain a nonzero term of the form $dp_j \wedge dx_1 \wedge \cdots \wedge dx_n$, leaving (23.19) as the only possible form for a polarized section of $K_P$. ∎

In Example 23.36, the polarized sections of $K_P$ are effectively just $n$-forms on the configuration space $\mathbb{R}^n$. This conclusion is a special case of the following result.


Proposition 23.37 If the leaf space $\Xi$ of $P$ is a smooth manifold and $\alpha$ is a polarized section of $K_P$, then there exists a unique $n$-form $\beta$ on $\Xi$ such that

$\alpha = q^*(\beta),$

where $q : N \to \Xi$ is the quotient map. Conversely, if $\beta$ is any $n$-form on $\Xi$, then $\alpha := q^*(\beta)$ is a polarized section of $K_P$.


Proof. Suppose, first, that $\alpha = q^*(\beta)$, for an $n$-form $\beta$ on $\Xi$. Then $X \lrcorner \alpha = 0$ whenever $X$ lies in $P$, since $P$ is the kernel of $q_*$. Furthermore, $d\alpha = q^*(d\beta) = 0$, since $\beta$ is an $n$-form on an $n$-manifold, showing that $\alpha$ is a polarized section of $K_P$.

In the other direction, we have already noted in the proof of Proposition 23.26 that $N$ can be identified locally with a neighborhood $U \times V$ of the origin in $\mathbb{R}^n \times \mathbb{R}^n$ in such a way that leaves of $P$ correspond to the sets of the form $\{x\} \times V$. We can use $q$ to identify $U \cong U \times \{0\}$ with an open set $\tilde{U}$ in $\Xi$. Thus, $P$ looks locally just like the vertical polarization on $\mathbb{R}^{2n}$, and so, by Example 23.36, any polarized section $\alpha$ of $K_P$ will be of the form (23.19). Thus, $\alpha$ determines an $n$-form $\beta$ on $U$ and $\alpha$ is the pullback of $\beta$ by the projection map of $U \times V$ onto $U$. It follows that $\alpha$ is locally the pullback by $q$ of an $n$-form $\beta$ on $\tilde{U}$. We leave it to the reader to check that overlapping neighborhoods in $N$ give the same form $\beta$ on $\Xi$ and that the desired result holds globally. ∎

Recall from Theorem 23.24 that $Q_{\mathrm{pre}}(f)$ preserves the space of polarized sections with respect to $P$, provided that the flow of $X_f$ preserves $\bar{P}$ (which equals $P$ in this case). We now establish that for any such $f$, the Lie derivative $\mathcal{L}_{X_f}$ preserves the space of polarized sections of $K_P$. This result will eventually allow us to define a quantum operator $Q(f)$ on the half-form Hilbert space associated to $P$.


Proposition 23.38 Suppose $X$ is a vector field on $N$ that preserves $P$, in the sense of Definition 23.22, and suppose $\alpha$ is a smooth section of $K_P$. Then the Lie derivative $\mathcal{L}_X\alpha$ is another section of $K_P$ and if $\alpha$ is polarized, $\mathcal{L}_X\alpha$ is also polarized.


Proof. Suppose $X_1,\ldots,X_n$ are smooth vector fields, with $X_1$ lying in $\bar{P} = P$. Then, by a standard formula for the Lie derivative,

$(\mathcal{L}_X\alpha)(X_1,\ldots,X_n) = X(\alpha(X_1,\ldots,X_n)) - \alpha([X,X_1],X_2,\ldots,X_n) - \sum_{j=2}^{n} \alpha(X_1,\ldots,[X,X_j],\ldots,X_n).$ (23.20)

Now, because $\alpha$ is a section of $K_P$, the first and third terms on the right-hand side of (23.20) vanish. Because $X$ preserves $P$, $[X,X_1]$ will again lie in $P$, and so the second term vanishes as well. Thus, $X_1 \lrcorner (\mathcal{L}_X\alpha) = 0$, which means that $\mathcal{L}_X\alpha$ is again a section of $K_P$.

Since $\mathcal{L}_X\alpha = X \lrcorner d\alpha + d(X \lrcorner \alpha)$, if $\alpha$ satisfies (23.17), we have

$d(\mathcal{L}_X\alpha) = d^2(X \lrcorner \alpha) = 0,$

showing that $\mathcal{L}_X\alpha$ is again polarized. ∎
Proposition 23.39 Suppose the leaf space $\Xi$ of $P$ is a smooth manifold and that a vector field $X$ on $N$ preserves $P$. Then there exists a unique vector field $Y$ on $\Xi$ such that

$q_{*,z}(X_z) = Y_{q(z)}$ (23.21)

for all $z \in N$. Furthermore, if $\alpha = q^*(\beta)$ is a polarized section of $K_P$, as in Proposition 23.37, then

$\mathcal{L}_X(q^*(\beta)) = q^*(\mathcal{L}_Y(\beta)).$ (23.22)

That is to say, under the identification in Proposition 23.37 of polarized sections of $K_P$ with $n$-forms on $\Xi$, the operator $\mathcal{L}_X$ corresponds to the Lie derivative on $\Xi$ in the direction of $Y$.
Proof. By Definition 23.22, $[X,Z]$ lies in $P$ whenever the vector field $Z$ lies in $P$. Thus, if a function $\phi$ is constant along $P$ (i.e., annihilated by every vector field $Z$ lying in $P$), the same will be true of $X\phi$. Thus, if $\phi$ is of the form $\phi = \psi \circ q$ for some function $\psi$ on $\Xi$, then $X\phi$ is of the form $\tilde\psi \circ q$ for some other function $\tilde\psi$ on $\Xi$. The map $\psi \mapsto \tilde\psi$ is easily seen to be a vector field, that is, a derivation of $C^\infty(\Xi)$. We conclude, then, that there is a unique vector field $Y$ on $\Xi$ such that

$X(\psi \circ q) = (Y\psi) \circ q$ (23.23)

for every smooth function $\psi$ on $\Xi$. It then follows from the definition of the differential that (23.21) holds for all $z \in N$. From (23.21), it follows easily that for any $n$-form $\beta$ on $\Xi$, we have

$X \lrcorner (q^*(\beta)) = q^*(Y \lrcorner \beta).$ (23.24)

Since $\beta$, being a top-degree form, is closed, $q^*(\beta)$ is also closed. Thus, one of the terms in the formula (21.7) for the Lie derivative of $\beta$ and $q^*(\beta)$ is zero. Applying $d$ to both sides of (23.24) then gives (23.22). ∎

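Given a vector field $Y$ and a nowhere-vanishing $n$-form $\beta$ on $\Xi$, let $\mathrm{div}_\beta Y$ be the unique function on $\Xi$ such that

$\mathcal{L}_Y(\beta) = (\mathrm{div}_\beta Y)\,\beta.$

Then by (23.22), we have

$\mathcal{L}_X(q^*(\beta)) = ((\mathrm{div}_\beta Y) \circ q)\,q^*(\beta).$ (23.25)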


The expression (23.25) will be helpful in analyzing the quantization of 
observables in Sect. 23.6.5. 
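
To make the definition of the divergence concrete, here is a worked one-dimensional case. It is only a sketch: we take the leaf space to be $\mathbb{R}$, and the density $\rho$ and coefficient $b$ are illustrative names introduced here, with $\beta = \rho(x)\,dx$ for a positive smooth function $\rho$ and $Y = b(x)\,d/dx$. Because $\beta$ has top degree, $d\beta = 0$ and the Lie derivative reduces to $d(Y \lrcorner \beta)$.

```latex
\begin{align*}
\mathcal{L}_Y\beta &= d(Y \lrcorner \beta) = d\bigl(\rho(x)\,b(x)\bigr)
                    = (\rho b)'(x)\,dx
                    = \frac{(\rho b)'(x)}{\rho(x)}\,\beta, \\
\operatorname{div}_\beta Y &= b'(x) + b(x)\,\frac{\rho'(x)}{\rho(x)} .
\end{align*}
```

For $\rho \equiv 1$ this reduces to the familiar divergence $b'(x)$.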


23.6.3 Square Roots of the Canonical Bundle 


We now assume that the leaf space $\Xi$ of $P$ is an orientable manifold, and we choose one particular orientation of $\Xi$.


Definition 23.40 Choose a nowhere-vanishing, oriented $n$-form $\beta$ on $\Xi$, so that $\alpha := q^*(\beta)$ is (Proposition 23.37) a nowhere-vanishing section of $K_P$. A section of $K_P$ is non-negative if it is, at each point, a non-negative multiple of $\alpha$. This notion does not depend on the choice of oriented $n$-form $\beta$.


Since $\Xi$ is orientable, the canonical bundle $K_P$ is trivializable: the section $\alpha$ in Definition 23.40 is a globally trivializing section. Thus, we can find a square root of $K_P$, that is, a line bundle $\delta_P$ such that $\delta_P \otimes \delta_P$ is isomorphic to $K_P$. (We may, for example, take $\delta_P$ to be the trivial bundle.) When we speak of a square root of $K_P$, we will mean, more precisely, a bundle $\delta_P$ together with a particular isomorphism of $\delta_P \otimes \delta_P$ with $K_P$. Thus, if $s_1$ and $s_2$ are sections of $\delta_P$, we think of $s_1 \otimes s_2$ as being a section of $K_P$. We assume, further, that the isomorphism of $\delta_P \otimes \delta_P$ with $K_P$ is chosen so that for any section $s$ of $\delta_P$, the section $s \otimes s$ of $K_P$ is non-negative. (If the initial isomorphism of $\delta_P \otimes \delta_P$ with $K_P$ does not have this property, compose it with $-I$ in the fibers of $K_P$.)

We may consider the complexification of $\delta_P$, that is, the line bundle $\delta_P^{\mathbb{C}}$ whose fiber at each point is the complexification of the fiber of $\delta_P$. There is then a notion of complex conjugation for sections of $\delta_P^{\mathbb{C}}$, which fixes the fiber of $\delta_P$ inside the fiber of $\delta_P^{\mathbb{C}}$ at each point. If $s_1$ and $s_2$ are sections of $\delta_P^{\mathbb{C}}$, we think of $s_1 \otimes s_2$ as a section of the complexified canonical bundle $K_P^{\mathbb{C}}$.

If $\alpha$ is a section of $K_P$ and $X$ is a vector field lying in $P$, let us define an $n$-form $\nabla_X\alpha$ by

$\nabla_X\alpha = X \lrcorner (d\alpha).$ (23.26)

Since $\alpha$ is a section of $K_P$, we have $X \lrcorner \alpha = 0$, which means that $\nabla_X\alpha$ actually coincides with $\mathcal{L}_X\alpha$, by (21.7). Since it lies in $P$, the vector field $X$ preserves $P$, and thus $\nabla_X\alpha = \mathcal{L}_X\alpha$ is again a section of $K_P$, by Proposition 23.38. The operator $\nabla$ in (23.26) has all the properties of a connection on $K_P$ except that it is only defined in the directions of $P$. [Note that $\mathcal{L}_X$ does not, in general, satisfy the condition $\mathcal{L}_{fX} = f\mathcal{L}_X$, as required by Definition 23.2. Since, however, $\mathcal{L}_X\alpha$ can also be computed as in (23.26), for any section $\alpha$ of $K_P$, the map $\nabla$ does satisfy $\nabla_{fX} = f\nabla_X$.]

We call $\nabla$ the natural partial connection on $K_P$. According to Definition 23.35, a section $\alpha$ of $K_P$ is polarized if and only if $\nabla_X\alpha = 0$ for each vector field $X$ lying in $P$. We now show that both the partial connection and the Lie derivative “descend” to sections of $\delta_P$ in a natural way. This result will, in particular, allow us to define a notion of polarized sections of $\delta_P$.


Proposition 23.41 Let $\delta_P$ be a fixed square root of $K_P$. For any vector field $X$ lying in $P$, there is a unique linear operator $\nabla_X$ mapping sections of $\delta_P$ to sections of $\delta_P$, such that

$\nabla_X(f s_1) = X(f)\,s_1 + f\,\nabla_X s_1$ (23.27)
$\nabla_X(s_1 \otimes s_2) = (\nabla_X s_1) \otimes s_2 + s_1 \otimes (\nabla_X s_2)$ (23.28)

for all smooth functions $f$ and all sections $s_1$ and $s_2$ of $\delta_P$. On the left-hand side of (23.28), $\nabla_X$ is the partial connection on $K_P$ given by (23.26).

If $X$ is a vector field on $N$ that preserves $P$, then there is a unique linear operator $\mathcal{L}_X$, mapping sections of $\delta_P$ to sections of $\delta_P$, such that

$\mathcal{L}_X(f s_1) = X(f)\,s_1 + f\,\mathcal{L}_X s_1$
$\mathcal{L}_X(s_1 \otimes s_2) = (\mathcal{L}_X s_1) \otimes s_2 + s_1 \otimes (\mathcal{L}_X s_2)$

for all smooth functions $f$ and all sections $s_1$ and $s_2$ of $\delta_P$.
Both of these constructions extend naturally from sections of $\delta_P$ to sections of $\delta_P^{\mathbb{C}}$.


We may then say that a section $s$ of $\delta_P^{\mathbb{C}}$ is polarized if $\nabla_X s = 0$ for every smooth vector field $X$ lying in $P$.
Proof. If $V$ is a one-dimensional vector space, then the map $\otimes : V \times V \to V \otimes V$ is commutative: $u \otimes v = v \otimes u$ for all $u, v \in V$. Furthermore, if $u_0$ is a nonzero element of $V$, then the map $u \mapsto u \otimes u_0$ is an invertible linear map of $V$ to $V \otimes V$. Suppose $s_0$ is a local nonvanishing section of $\delta_P$. Applying (23.28) with $s_1 = s_2 = s_0$, we want

$2(\nabla_X s_0) \otimes s_0 = \nabla_X(s_0 \otimes s_0).$ (23.29)

Since the operation of tensoring with $s_0$ is invertible, there is a unique section “$\nabla_X s_0$” of $\delta_P$ for which (23.29) holds.

Locally, any section $s$ of $\delta_P$ can be written as $s = g s_0$ for a unique function $g$. We then define $\nabla_X s$ by

$\nabla_X s = X(g)\,s_0 + g\,\nabla_X s_0,$ (23.30)

in which case, (23.27) is easily seen to hold. If $s_1 = g_1 s_0$ and $s_2 = g_2 s_0$, then using (23.29) and the symmetry of the tensor product, it is easy to verify that (23.28) holds, with both sides of the equation equal to

$X(g_1 g_2)\,s_0 \otimes s_0 + g_1 g_2\,\nabla_X(s_0 \otimes s_0).$

Uniqueness of $\nabla_X$ holds because both (23.29) and (23.30) are required by the definition of $\nabla_X$. The action of $\nabla_X$ extends to sections of $\delta_P^{\mathbb{C}}$ by writing such sections as complex-valued functions times $s_0$. The analysis of the Lie derivative is similar and is omitted. ∎
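
The content of (23.29) can be read as a “square-root rule” for derivatives. As an illustrative sketch, suppose $s_0 \otimes s_0 = \alpha$ for a local nonvanishing section $\alpha$ of $K_P$, and write $\nabla_X\alpha = h\,\alpha$; the function $h$ is an auxiliary name introduced here, not notation from the text.

```latex
\begin{align*}
2(\nabla_X s_0)\otimes s_0 &= \nabla_X(s_0\otimes s_0) = \nabla_X\alpha
   = h\,\alpha = h\,s_0\otimes s_0, \\
\nabla_X s_0 &= \tfrac{1}{2}\,h\,s_0 .
\end{align*}
```

The second line follows from the first because tensoring with $s_0$ is invertible. The same computation applies with $\mathcal{L}_X$ in place of $\nabla_X$ when $X$ preserves $P$: the square root picks up half of the logarithmic derivative of $\alpha$.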


23.6.4 The Half-Form Hilbert Space


We continue to assume that the leaf space $\Xi$ of $P$ is an orientable manifold, and that we have chosen an orientation on $\Xi$. We assume that we have chosen a square root $\delta_P$ of $K_P$, as in Sect. 23.6.3. If $L$ is a prequantum line bundle over $N$, we now form the tensor product bundle $L \otimes \delta_P^{\mathbb{C}}$. Given two sections $s_1$ and $s_2$ of $L \otimes \delta_P^{\mathbb{C}}$, we decompose them locally as $s_j = \mu_j \otimes \nu_j$, where $\mu_j$ is a section of $L$ and $\nu_j$ is a section of $\delta_P^{\mathbb{C}}$, and where, say, the $\mu_j$'s are taken to be nonvanishing. Then we can combine these sections to form the quantity

$(s_1, s_2) = (\mu_1, \mu_2)\,\bar{\nu}_1 \otimes \nu_2,$ (23.31)

where $(\mu_1, \mu_2)$ is the pointwise inner product given by the Hermitian structure on $L$. Since $(\mu_1, \mu_2)$ is a scalar-valued function and $\bar{\nu}_1 \otimes \nu_2$ is a section of $K_P^{\mathbb{C}}$, the quantity $(s_1, s_2)$ is a section of $K_P^{\mathbb{C}}$. Any other decomposition of $s_j$ as the tensor product of a nonvanishing section of $L$ and a section of $\delta_P^{\mathbb{C}}$ is of the form $(f\mu_j) \otimes (\nu_j/f)$ for some nonvanishing function $f$, and the value of $(s_1, s_2)$ is the same as for the original decomposition. Since it is independent of the choice of local decomposition, $(s_1, s_2)$ is actually defined globally.

Given the connection on $L$ and the partial connection (Proposition 23.41) on $\delta_P^{\mathbb{C}}$, we can form a partial connection on $L \otimes \delta_P^{\mathbb{C}}$ with the following property. For any vector field $X$ lying in $P$, and any section $s$ of $L \otimes \delta_P^{\mathbb{C}}$, if we decompose $s$ locally as $s = \mu \otimes \nu$, where $\mu$ is a nonvanishing section of $L$ and $\nu$ is a section of $\delta_P^{\mathbb{C}}$, then

$\nabla_X(s) = (\nabla_X\mu) \otimes \nu + \mu \otimes (\nabla_X\nu).$ (23.32)

The reader may verify that if $\mu \otimes \nu$ is replaced by $(f\mu) \otimes (\nu/f)$ for some nonvanishing function $f$, the value of $\nabla_X(s)$ is unchanged. Thus, as with the quantity $(s_1, s_2)$ in (23.31), $\nabla_X(s)$ is defined globally. We then define a section $s$ of $L \otimes \delta_P^{\mathbb{C}}$ to be polarized if $\nabla_X s = 0$ for each vector field $X$ lying in $P$. If $s_1$ and $s_2$ are polarized sections of $L \otimes \delta_P^{\mathbb{C}}$, then the section $(s_1, s_2)$ in (23.31) is easily seen to be a polarized section of $K_P^{\mathbb{C}}$.

As in the case without half-forms, there is an obstruction to the existence of globally defined polarized sections of $L \otimes \delta_P^{\mathbb{C}}$. We say that a leaf $R$ is Bohr-Sommerfeld (in the half-form sense, with respect to a particular choice of $\delta_P$) if there exists a nonzero section $s$ of $L \otimes \delta_P^{\mathbb{C}}$ defined over $R$ such that $\nabla_X s = 0$ for each tangent vector $X$ to $R$. As in the case without half-forms, if the leaves are topologically nontrivial, the Bohr-Sommerfeld leaves will in general be a discrete set in the space of all leaves.

The Bohr-Sommerfeld leaves in the half-form sense need not be the same as the Bohr-Sommerfeld leaves in the sense of Definition 23.27. In the setting of Example 23.29, for instance, the canonical bundle $K_P$ is trivial, but the square-root bundle $\delta_P$ may be chosen to be nontrivial, by putting in a twist by 180 degrees over each copy of $S^1$. (That is to say, we think of $S^1$ as the interval $[0, 2\pi]$ with the ends identified, and we attach a copy of $\mathbb{R}$ to each point. But when identifying the fiber at $2\pi$ with the fiber at 0, we use the negative of the identity map.) As Exercise 9 shows, in this example, the Bohr-Sommerfeld leaves are the sets of the form $\{x\} \times S^1$, where $x/\hbar = n + 1/2$ for some integer $n$.


Definition 23.42 For any purely real polarization $P$ and any square root $\delta_P$ of $K_P$, the half-form space is the space of smooth, polarized sections of $L \otimes \delta_P^{\mathbb{C}}$. For a polarized section $s$ of $L \otimes \delta_P^{\mathbb{C}}$, define the norm of $s$ by

$\|s\|^2 = \int_\Xi \widetilde{(s,s)},$ (23.33)

where $(s,s)$ is as in (23.31) and where $\widetilde{(s,s)}$ is the $n$-form on $\Xi$ given by Proposition 23.37. If $s_1$ and $s_2$ are elements of the half-form space with $\|s_1\| < \infty$ and $\|s_2\| < \infty$, define the inner product of $s_1$ and $s_2$ by

$\langle s_1, s_2 \rangle = \int_\Xi \widetilde{(s_1, s_2)}.$

The half-form Hilbert space is the completion with respect to the norm (23.33) of the space of polarized sections $s$ for which $\|s\| < \infty$.


The integral of $n$-forms on $\Xi$ is taken with respect to the chosen orientation on $\Xi$. We can always decompose $s$ locally as $s = \mu \otimes \nu$ with $\nu$ being a section of $\delta_P$ (as opposed to $\delta_P^{\mathbb{C}}$) and $\mu$ being a section of $L$. Then

$(s,s) = (\mu,\mu)\,\nu \otimes \nu,$

from which we see that $(s,s)$ is a non-negative section of $K_P$ (Definition 23.40). (Recall that we have chosen the identification of $\delta_P \otimes \delta_P$ with $K_P$ in a particular way, so that $\nu \otimes \nu$ is always a non-negative multiple of the pullback by $q$ of an oriented form on $\Xi$.) Thus, the integral on the right-hand side of (23.33) is non-negative, but possibly infinite.


Example 23.43 Let $N = T^*\mathbb{R} \cong \mathbb{R}^2$ and let $L$ be the trivial bundle on $N$, with connection $\nabla_X = X - (i/\hbar)\theta(X)$, where $\theta = p\,dx$. Let $P$ be the vertical polarization on $N$ and orient $\mathbb{R}$ so that oriented 1-forms are positive multiples of $dx$. Let $\delta_P$ be the trivial bundle, with a trivializing section “$\sqrt{dx}$” of $\delta_P$ such that $\sqrt{dx} \otimes \sqrt{dx} = dx$. Then every polarized section $s$ of $L \otimes \delta_P^{\mathbb{C}}$ has the form

$s = \psi(x) \otimes \sqrt{dx}$ (23.34)

for some function $\psi$ on $\mathbb{R}$. The norm of such a section is computed as

$\|s\|^2 = \int_{\mathbb{R}} |\psi(x)|^2\,dx.$


Proof. The sections of $K_P$ are 1-forms that are zero on $\partial/\partial p$, that is, 1-forms of the form $\alpha = f(x,p)\,dx$. Such a 1-form satisfies $d\alpha = 0$ if and only if $f$ is independent of $p$. Thus, $dx$ is a globally defined polarized section of $K_P$. If we choose $\delta_P$ to be trivial and let $\sqrt{dx}$ be such that $\sqrt{dx} \otimes \sqrt{dx} = dx$, then $\sqrt{dx}$ will be a polarized section of $\delta_P$. Every section $s$ of $L \otimes \delta_P^{\mathbb{C}}$ can be written uniquely as $s = \psi(x,p) \otimes \sqrt{dx}$ for some function $\psi$. Since $\sqrt{dx}$ is polarized and $\theta(\partial/\partial p) = 0$, we see that $s$ is polarized if and only if $\psi$ is independent of $p$. For a section of the form (23.34), we have $(s,s) = |\psi(x)|^2\,dx$, in which case, $\widetilde{(s,s)}$ is given by the same formula as $(s,s)$, but now interpreted as a 1-form on $\Xi \cong \mathbb{R}$ rather than $\mathbb{R}^2$. ∎


23.6.5 Quantization of Observables


Suppose $f$ is a function on $N$ for which $X_f$ preserves $P$ in the sense of Definition 23.22. We will now associate with $f$ a self-adjoint (or, at least, symmetric) operator $Q(f)$ on the half-form Hilbert space of $P$. Operators of this sort will satisfy exactly the desired commutation relations.


Definition 23.44 For any function $f$ on $N$ for which $X_f$ preserves $P$, let $Q(f)$ be the operator on the half-form space of $P$ given by

$Q(f)s = (Q_{\mathrm{pre}}(f)\mu) \otimes \nu + i\hbar\,\mu \otimes \mathcal{L}_{X_f}\nu,$

where $s$ is decomposed locally as $s = \mu \otimes \nu$, with $\mu$ being a section of $L$ and $\nu$ a section of $\delta_P^{\mathbb{C}}$.


The operator $Q(f)$ is well defined (i.e., independent of the choice of local trivialization) as may easily be verified. This independence holds, however, only because the coefficient $i\hbar$ of $\nabla_{X_f}$ in the first term exactly matches the coefficient $i\hbar$ of $\mathcal{L}_{X_f}$ in the second term.

Before describing the general properties of the operators Q(f), we con- 
sider a simple example that illustrates the essential role of the Lie derivative 
term in Definition 23.44. 


Example 23.45 Let the notation be as in Example 23.43, and let $f : \mathbb{R}^2 \to \mathbb{R}$ be of the form

$f(x,p) = a(x) + b(x)p,$

for some smooth functions $a$ and $b$ on $\mathbb{R}$. Then $X_f$ preserves $P$ and

$Q(f)(\psi(x) \otimes \sqrt{dx}) = \tilde\psi(x) \otimes \sqrt{dx},$

where

$\tilde\psi(x) = -i\hbar\left(b(x)\psi'(x) + \frac{1}{2}b'(x)\psi(x)\right) + a(x)\psi(x).$

In particular, if $f(x,p) = x$, then $\tilde\psi(x) = x\psi(x)$ and if $f(x,p) = p$, then $\tilde\psi(x) = -i\hbar\,\partial\psi/\partial x$. More generally, if $a$ and $b$ are polynomials, then the action of $Q(f)$ on $\psi$ coincides with the Weyl quantization of $f$ (Exercise 8 in Chap. 13).

The term involving $b'(x)$ comes from the presence of half-forms and is absent in the formula (22.15) for $Q_{\mathrm{pre}}(f)$. The $b'$ term, with the exact coefficient of 1/2, is necessary for $Q(f)$ to be self-adjoint (or, at least, symmetric); see Exercise 10. Example 23.45 is actually quite representative of the general case. [Compare (23.38) in the proof of Theorem 23.47 and Example 23.48.]
Proof. We have computed $Q_{\mathrm{pre}}(f)$ in (22.15) in the proof of Proposition 22.12. We compute that $X_f$ is equal to $-b(x)\,\partial/\partial x$ plus a term involving $\partial/\partial p$. Since the 1-form $dx$ is closed, we obtain, by (21.7),

$\mathcal{L}_{X_f}(dx) = d(X_f \lrcorner dx) = -db(x) = -b'(x)\,dx.$

Using Proposition 23.41, we then obtain

$\left(\mathcal{L}_{X_f}\sqrt{dx}\right) \otimes \sqrt{dx} = -\frac{1}{2}b'(x)\,dx = -\frac{1}{2}b'(x)\sqrt{dx} \otimes \sqrt{dx},$ (23.35)

which gives

$\mathcal{L}_{X_f}\left(\sqrt{dx}\right) = -\frac{1}{2}b'(x)\sqrt{dx}.$

Adding the $\mathcal{L}_{X_f}$ term to the previously computed expression for $Q_{\mathrm{pre}}(f)$ gives the desired result. ∎
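
The role of the coefficient 1/2 in Example 23.45 can be checked numerically. The following is a minimal sketch, not part of the text: it takes $\hbar = 1$, $a(x) = x^2$, and $b(x) = x$, picks two rapidly decaying test functions (all of these choices, and every function name, are assumptions made here for illustration), and compares $\langle \phi, Q\psi \rangle$ with $\langle Q\phi, \psi \rangle$ when the $b'$ term carries a general coefficient $c$. The inner product is conjugate-linear in its first argument, as in (23.31).

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (illustrative choice)

# f(x, p) = a(x) + b(x) p, with illustrative a and b
def a(x): return x * x
def b(x): return x
def db(x): return 1.0

# Two rapidly decaying test functions and their exact derivatives
def phi(x): return cmath.exp(-x * x)
def dphi(x): return -2 * x * cmath.exp(-x * x)
def psi(x): return (1 + 1j * x) * cmath.exp(-x * x / 2)
def dpsi(x): return (1j - x * (1 + 1j * x)) * cmath.exp(-x * x / 2)

def Q(u, du, c):
    """Return x -> -i*hbar*(b u' + c b' u) + a u; c = 1/2 is the half-form value."""
    def Qu(x):
        return -1j * HBAR * (b(x) * du(x) + c * db(x) * u(x)) + a(x) * u(x)
    return Qu

def inner(u, v, lo=-12.0, hi=12.0, n=4801):
    """Trapezoid approximation of the integral of conj(u) * v over [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0j
    for k in range(n):
        x = lo + k * h
        w = 0.5 if k in (0, n - 1) else 1.0
        total += w * u(x).conjugate() * v(x)
    return total * h

def symmetry_defect(c):
    """|<phi, Q psi> - <Q phi, psi>| for the operator with coefficient c."""
    return abs(inner(phi, Q(psi, dpsi, c)) - inner(Q(phi, dphi, c), psi))

if __name__ == "__main__":
    print("defect with c = 1/2:", symmetry_defect(0.5))
    print("defect with c = 0  :", symmetry_defect(0.0))
```

Integrating by parts shows that the defect equals $\hbar\,|1 - 2c|\,\left|\int b'\,\bar\phi\,\psi\,dx\right|$, so it vanishes for all test functions only when $c = 1/2$; the script simply confirms this for one pair of functions.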

Returning now to the setting of general real polarizations, we establish 
two key results for the quantized observables Q(f), that they satisfy the 
desired commutation relations and that they are self-adjoint (or, at least, 
symmetric) whenever f is real valued. It can also be shown that when f is 
a polarized function (i.e., constant along each leaf of P), then Q(f) acts on 
the quantum Hilbert space simply as multiplication by f. See Exercise 11. 


Theorem 23.46 Suppose $f$ and $g$ are functions on $N$ for which $X_f$ and $X_g$ preserve $P$. Then the operators $Q(f)$ and $Q(g)$ satisfy

$\frac{1}{i\hbar}[Q(f), Q(g)] = Q(\{f,g\})$

on the space of smooth, polarized sections of $L \otimes \delta_P^{\mathbb{C}}$.


Proof. Since $Q(h)$ is a local operator for any function $h$, it suffices to prove the result locally. Let us choose, then, a local nonvanishing section $\nu_0$ of $\delta_P^{\mathbb{C}}$, so that, locally, each section $s$ of $L \otimes \delta_P^{\mathbb{C}}$ can be decomposed uniquely as $s = \mu \otimes \nu_0$. For any vector field $X$ preserving $P$, we let $\gamma(X)$ be the function such that

$\mathcal{L}_X(\nu_0) = \gamma(X)\,\nu_0.$

We then have $Q(f)(\mu \otimes \nu_0) = \tilde\mu \otimes \nu_0$, where

$\tilde\mu = [Q_{\mathrm{pre}}(f) + i\hbar\,\gamma(X_f)]\mu.$

We now compute that

$[Q_{\mathrm{pre}}(f) + i\hbar\,\gamma(X_f),\ Q_{\mathrm{pre}}(g) + i\hbar\,\gamma(X_g)]$
$\quad = [Q_{\mathrm{pre}}(f), Q_{\mathrm{pre}}(g)] + i\hbar[Q_{\mathrm{pre}}(f), \gamma(X_g)] + i\hbar[\gamma(X_f), Q_{\mathrm{pre}}(g)]$
$\quad = i\hbar\,Q_{\mathrm{pre}}(\{f,g\}) + (i\hbar)^2\left(X_f(\gamma(X_g)) - X_g(\gamma(X_f))\right).$

The desired result will follow if we can verify that

$X_f(\gamma(X_g)) - X_g(\gamma(X_f)) = \gamma(X_{\{f,g\}}).$ (23.36)

To verify (23.36), we use a standard identity for the Lie derivative on forms: $\mathcal{L}_{[X,Y]} = [\mathcal{L}_X, \mathcal{L}_Y]$. Using Proposition 23.41, we can easily show that this identity holds also on sections of $\delta_P^{\mathbb{C}}$ for vector fields that preserve $P$. It is then a simple calculation (Exercise 12) to verify (23.36). ∎
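
One way the final calculation might go is sketched below. It applies the identity $\mathcal{L}_{[X,Y]} = [\mathcal{L}_X, \mathcal{L}_Y]$ to the nonvanishing section $\nu_0$ and uses the product rule from Proposition 23.41. The last step assumes the relation $[X_f, X_g] = X_{\{f,g\}}$, which depends on the sign conventions of Chap. 21 and is not restated in this section.

```latex
\begin{align*}
\gamma([X,Y])\,\nu_0
  &= \mathcal{L}_X\bigl(\gamma(Y)\,\nu_0\bigr) - \mathcal{L}_Y\bigl(\gamma(X)\,\nu_0\bigr) \\
  &= \bigl[X(\gamma(Y)) + \gamma(Y)\gamma(X) - Y(\gamma(X)) - \gamma(X)\gamma(Y)\bigr]\,\nu_0 \\
  &= \bigl[X(\gamma(Y)) - Y(\gamma(X))\bigr]\,\nu_0 .
\end{align*}
% Taking X = X_f and Y = X_g, and assuming [X_f, X_g] = X_{\{f,g\}}:
\[
  X_f(\gamma(X_g)) - X_g(\gamma(X_f)) = \gamma\bigl(X_{\{f,g\}}\bigr).
\]
```

The quadratic terms $\gamma(X)\gamma(Y)$ cancel because functions commute, which is why only first derivatives of $\gamma$ survive.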
Theorem 23.47 If $f \in C^\infty(N)$ is real valued and $X_f$ preserves $P$, then the operator $Q(f)$ is symmetric on the space of smooth sections $s$ in the half-form space for which $\widetilde{(s,s)}$ has compact support on $\Xi$.


Proof. Suppose $\alpha = q^*(\beta)$ is a polarized section of $K_P$, so that there is, at least locally, a corresponding polarized section $\sqrt{q^*(\beta)}$ of $\delta_P$. If $X_f$ preserves $P$, then by Proposition 23.39, there is a unique vector field $Y_f$ on $\Xi$ such that $q_{*,z}(X_f) = Y_f$ for all $z \in N$. Using (23.25) and Proposition 23.41, we get

$\mathcal{L}_{X_f}\left(\sqrt{q^*(\beta)}\right) = \frac{1}{2}\left((\mathrm{div}_\beta Y_f) \circ q\right)\sqrt{q^*(\beta)}.$


Meanwhile, it is not hard to show (Exercise 13) that it is possible to choose a local symplectic potential $\theta$ that is zero in the directions of $P$. Thus, we can trivialize $L$ locally in such a way that sections that are covariantly constant along $P$ are simply functions that are constant along $P$ in the ordinary sense. Thus, elements $s$ of the half-form space have, locally, the form

$s = (\psi \circ q) \otimes \sqrt{q^*(\beta)}$ (23.37)

for some function $\psi$ and $n$-form $\beta$ on $\Xi$. Thus, if $X_f$ preserves $P$, and a section $s$ is decomposed locally as in (23.37), we have

$Q(f)(s) = (\tilde\psi \circ q) \otimes \sqrt{q^*(\beta)},$

where

$\tilde\psi = i\hbar\left(Y_f\psi + \frac{1}{2}(\mathrm{div}_\beta Y_f)\psi\right) + (\theta(X_f) + f)\psi.$ (23.38)

Here the multiplication term comes from $Q_{\mathrm{pre}}(f) = i\hbar\nabla_{X_f} + f$ with $\nabla_X = X - (i/\hbar)\theta(X)$. It can be verified (Exercise 14) that the function $\theta(X_f) + f$ is constant along $P$ and thus may be thought of as a function on $\Xi$.

By multiplying elements of the half-form space by functions of the form $\chi \circ q$, with $\chi$ having compact support in $\Xi$, we can “localize” the calculations on $\Xi$. Suppose $s_1$ and $s_2$ are two elements of the half-form space decomposed as in (23.37) near a point $z \in N$, with the same $\beta$ and two different functions $\psi_1$ and $\psi_2$ on $\Xi$. Then $\widetilde{(s_1, s_2)}$ has the form $\bar\psi_1\psi_2\,\beta$ in a neighborhood $U$ of $q(z)$. By localization, we may assume that $\widetilde{(s_1, s_2)}$ has compact support in $U$, and we then have

$\langle s_1, Q(f)s_2 \rangle = \int_U \bar\psi_1\,\tilde\psi_2\,\beta,$

where $\tilde\psi_2$ is as in (23.38). “Integration by parts” (Exercise 15) with respect to $\beta$ then shows that this quantity coincides with $\langle Q(f)s_1, s_2 \rangle$. ∎


Example 23.48 (Cotangent Bundles) Let $N = T^*M$ for an oriented manifold $M$, let $\theta$ be the canonical 1-form on $N$, and let $L$ be the trivial line bundle on $N$, with connection $\nabla_X = X - (i/\hbar)\theta(X)$. Let $P$ be the vertical polarization on $N$, so that $K_P$ is trivial, and let $\delta_P$ be chosen to be trivial. Let $\beta$ be an arbitrary nowhere-vanishing, oriented $n$-form on $M$, so that $\alpha := \pi^*(\beta)$ is a nowhere-vanishing section of $K_P$, and choose a trivializing section $\sqrt{\alpha}$ of $\delta_P$ with $\sqrt{\alpha} \otimes \sqrt{\alpha} = \alpha$. In that case, elements $s$ of the half-form Hilbert space have the form $s = (\psi \circ \pi) \otimes \sqrt{\alpha}$, where $\psi$ is a function on $M$, and

$\|s\|^2 = \int_M |\psi|^2\,\beta.$

The half-form Hilbert space may, thus, be identified with $L^2(M, \beta)$.

Suppose now that $f$ is a function on $T^*M$ of the form $f = f_1 + f_2$, where $f_1$ is constant on each fiber of $T^*M$ and $f_2$ is linear on each fiber. Then $f_2$ may be thought of as a section of $T^{**}M \cong TM$, that is, as a vector field $Y_f$ on $M$. In that case, $X_f$ preserves $P$ and $Q(f)$ acts on elements of the half-form space as

$Q(f)\left((\psi \circ \pi) \otimes \sqrt{\alpha}\right) = (\tilde\psi \circ \pi) \otimes \sqrt{\alpha},$

where

$\tilde\psi = i\hbar\left(Y_f\psi + \frac{1}{2}(\mathrm{div}_\beta Y_f)\psi\right) + f_1\psi.$

Here $\mathrm{div}_\beta Y_f$ is the unique function such that $\mathcal{L}_{Y_f}\beta = (\mathrm{div}_\beta Y_f)\,\beta$.


A simple calculation in coordinates shows that the vector field $Y_f$ in the example satisfies $X_f(\psi \circ \pi) = (Y_f\psi) \circ \pi$, so that our notation is consistent with that in Proposition 23.39 [see (23.23)].
Proof. The calculation is precisely the same as in the proof of Theorem 23.47, except that the decomposition in (23.37) is now global. The claimed form of $Q(f)$ is nothing but the expression (23.38), where the reader may easily compute, using local coordinates, that the multiplication term $\theta(X_f) + f$ coming from $Q_{\mathrm{pre}}(f) = i\hbar\nabla_{X_f} + f$ is equal to $f_1$. ∎

It is an unfortunate feature of geometric quantization that in the case 
of the vertical polarization on cotangent bundles, it only permits us to 
quantize functions that are at most linear in the momentum variables. In 
a typical physical system having $T^*M$ as its phase space, there will be a
“kinetic energy” term in the classical Hamiltonian that is quadratic in p. 
To quantize such a system, one has to find a way to quantize the kinetic 
energy term, “by hook or by crook.” 

One approach to this problem is to allow the exponentiated quantized Hamiltonian to change the polarization, and then to use pairing maps (Sect. 23.8) to “project” back to the Hilbert space for the original polarization. As explained in Sect. 9.7 of [45], this approach succeeds in the case that the kinetic energy term is $g(p,p)/(2m)$, where $g$ is the Riemannian structure on $T^*M$ induced by a Riemannian structure on $TM$. The quantized kinetic energy operator turns out to be given by the map

$\psi \mapsto -\frac{\hbar^2}{2m}\left((\Delta\psi)(x) - \frac{1}{6}R(x)\psi(x)\right),$ (23.39)

where $\Delta$ is the Laplacian for $M$ (taken to be a negative operator) and where $R(x)$ is the scalar curvature of the Riemannian structure on $TM$. The calculation in [45] glosses over one technical issue, which is that the time-evolved polarizations may not be everywhere transverse to the original polarization. Nevertheless, the calculation provides a reasonable geometric motivation for the formula (23.39).

It should be emphasized that, because of the projections involved in 
the computation of the quantized kinetic energy operator, it does not sat- 
isfy the desired commutation relations with the quantizations of functions 
whose flow preserves the vertical polarization. Nevertheless, this approach 
to quantizing the kinetic energy may simply be the best one can do. 


23.7 Quantization with Half-Forms: The 
Complex Case 


In the case of a purely complex polarization, half-forms are not “neces- 
sary,” in that we typically have a nonzero Hilbert space even without them. 
Nevertheless, their inclusion gives advantages. In the first place, using half- 
forms makes the complex case more parallel to the real case. In the second 
place, complex quantization with half-forms simply gives better results than 
without half-forms. In the case of the harmonic oscillator, for example, the inclusion of half-forms allows (Example 23.53) geometric quantization to reproduce precisely the spectrum $(n + 1/2)\hbar\omega$, $n = 0, 1, 2, \ldots$, that we found in the traditional treatment. This result should be compared to Proposition 22.14 without half-forms, where the spectrum is found to be $n\hbar\omega$.

Throughout this section, we assume that $(N, \omega)$ is a $2n$-dimensional quantizable symplectic manifold, that $(L, \nabla)$ is a prequantum line bundle over $N$, and that $P$ is a Kähler polarization on $N$ (Definition 23.19). Since the definitions in the complex case are very similar to those in the real case (with a few important differences), we will run through them quickly. Since $\bar{P}$ is no longer equal to $P$, we need to replace $P$ by $\bar{P}$ in many of the formulas from Sect. 23.6.

The canonical bundle K_P of P is the complex line bundle for which the
sections are n-forms α satisfying

\[ X \lrcorner \alpha = 0 \]

for each vector field X lying in P̄. Sections of K_P are precisely the
(n,0)-forms on N. A section of K_P is said to be polarized if

\[ X \lrcorner (d\alpha) = 0 \tag{23.40} \]

for every vector field X lying in P̄, or, equivalently, if dα = 0. Polarized
sections of K_P are precisely the holomorphic (n,0)-forms on N. By a square
root of K_P we will mean a complex line bundle δ_P over N such that δ_P ⊗ δ_P
is isomorphic with K_P, together with a particular isomorphism of δ_P ⊗ δ_P
with K_P. Thus, if s_1 and s_2 are sections of δ_P, we think of s_1 ⊗ s_2 as
being a section of K_P. We assume that such a square root exists and we fix
for the remainder of this section one particular square root δ_P.

If X is a vector field that preserves P, in the sense of Definition 23.22,
then ℒ_X preserves the space of sections of K_P and also the space of
polarized sections of K_P. The condition (23.40) defining polarized sections
of K_P can be understood as the vanishing of a partial connection ∇, defined
for vector fields lying in P̄, and given by ∇_X α = X ⌟ (dα). Both the partial
connection (for vector fields lying in P̄) and the Lie derivative (for vector
fields preserving P) descend from K_P to δ_P, as in Proposition 23.41 in the
real case. The connection on L and the partial connection on δ_P combine
to give a partial connection on L ⊗ δ_P. A section s of L ⊗ δ_P is said to be
polarized if ∇_X s = 0 for all vector fields X lying in P̄.


Notation 23.49 If β is any 2n-form on N, let the expression

\[ \frac{\beta}{\lambda} \]

denote the unique function on N such that β = (β/λ)λ, where λ is the
Liouville form in Definition 21.16.


Unlike the canonical bundle in the real case, the canonical bundle in the 
purely complex case carries a natural Hermitian structure. 


Proposition 23.50 If α is an (n,0)-form on N, then at each point the
2n-form

\[ (-1)^{n(n-1)/2} (-i)^n \, \bar{\alpha} \wedge \alpha \]

is a non-negative multiple of the Liouville form λ. There is then a unique
Hermitian structure on δ_P with the property that for each section s of δ_P
we have

\[ |s|^2 = \left( \frac{(-1)^{n(n-1)/2} (-i)^n \, \overline{(s \otimes s)} \wedge (s \otimes s)}{2^n \lambda} \right)^{1/2}. \tag{23.41} \]

The factor of 2^n in the denominator in (23.41) is inserted for convenience,
to make certain formulas come out more nicely.
Proof. See Exercise 17. ∎
Since, by assumption, there is a Hermitian structure on L, the above
Hermitian structure on δ_P gives rise in a natural way to a Hermitian
structure on L ⊗ δ_P.


Definition 23.51 The half-form Hilbert space for a Kähler polarization
P on N is the space of square-integrable polarized sections of L ⊗ δ_P.




In the ℂ^n case, using the canonical 1-form as our symplectic potential,
elements of the half-form Hilbert space take the form

\[ e^{-|\mathrm{Im}\, z|^2/(2\alpha\hbar)} F(z) \otimes \sqrt{dz_1 \wedge \cdots \wedge dz_n}. \]

In this special case, the norm of the half-form factor √(dz_1 ∧ ⋯ ∧ dz_n) is
constant and the half-form Hilbert space is still identifiable with the space
in Conclusion 22.10. In the case of the unit disk, on the other hand, the
presence of half-forms alters the inner product; see Exercise 16.

We now define quantized observables on the half-form Hilbert space, 
using the same formula as in the real case. 


Definition 23.52 If f is a function on N for which X_f preserves P, let
Q(f) be the operator on the half-form Hilbert space of P given by

\[ Q(f)s = (Q_{\mathrm{pre}}(f)\mu) \otimes \nu - i\hbar\, \mu \otimes \mathcal{L}_{X_f}\nu, \]

where s is decomposed locally as s = μ ⊗ ν, with μ being a section of L and
ν a section of δ_P.


These operators satisfy [Q(f), Q(g)]/(iħ) = Q({f, g}) on the space of
smooth polarized sections of L ⊗ δ_P, with the proof of this result being
identical to the proof of Theorem 23.46 in the real case. If f is real-valued
and X_f preserves P, then Q(f) will be at least symmetric, assuming we can
find a dense subspace of the half-form Hilbert space consisting of “nice”
functions. (Finding dense subspaces is more difficult in the holomorphic
case than in the real case.) A proof of this claim is sketched in Exercise 18.


Example 23.53 Consider ℝ² ≅ T*ℝ with the Kähler polarization P given
by the global complex coordinate z = x − ip/(mω), for some positive
number ω. Take δ_P to be trivial with trivializing section √dz. Consider
also the harmonic oscillator Hamiltonian H := (p² + (mωx)²)/(2m). Then
X_H preserves P and the operator Q(H) on the half-form Hilbert space
has spectrum consisting of numbers of the form (n + 1/2)ħω, where n =
0, 1, 2, ....


In this example, ω is the frequency of the oscillator and not the canonical
2-form.
Proof. The calculation is the same as in the proof of Proposition 22.14,
except for the addition of the Lie derivative term. A simple calculation
shows that ℒ_{X_H}(dz) = iω dz, from which it follows that ℒ_{X_H} √dz =
(iω/2)√dz. It is then easy to see that the set of elements of the form
e^{−mω|Im z|²/(2ħ)} z^n ⊗ √dz forms an orthogonal basis of eigenvectors for
Q(H), with eigenvalues (n + 1/2)ħω. ∎
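
The "simple calculation" in the proof can be spot-checked numerically. The sketch below assumes that X_H is the vector field generating Hamilton's equations (dx/dt = ∂H/∂p, dp/dt = −∂H/∂x); with the opposite sign convention the sign of the result flips. The function names and the sample values of m, ω, x, p are illustrative choices, not taken from the text. It verifies that X_H z = iωz, which is what gives ℒ_{X_H}(dz) = iω dz, and records the resulting shift (−iħ)(iω/2) = ħω/2 in units of ħ.

```python
# Check that X_H z = i*omega*z for z = x - i*p/(m*omega), assuming X_H
# generates Hamilton's equations (an assumed sign convention).

def hamiltonian(x, p, m, omega):
    return (p**2 + (m * omega * x)**2) / (2 * m)

def z_coord(x, p, m, omega):
    return complex(x, -p / (m * omega))

def partial(f, x, p, which, h=1e-6):
    """Central-difference partial derivative of f(x, p) in x or in p."""
    if which == "x":
        return (f(x + h, p) - f(x - h, p)) / (2 * h)
    return (f(x, p + h) - f(x, p - h)) / (2 * h)

def apply_hamiltonian_field(x, p, m, omega):
    """X_H z = (dH/dp)(dz/dx) - (dH/dx)(dz/dp)."""
    H = lambda a, b: hamiltonian(a, b, m, omega)
    z = lambda a, b: z_coord(a, b, m, omega)
    return (partial(H, x, p, "p") * partial(z, x, p, "x")
            - partial(H, x, p, "x") * partial(z, x, p, "p"))

m, omega, x0, p0 = 1.3, 0.7, 0.4, -1.1      # arbitrary sample values
lhs = apply_hamiltonian_field(x0, p0, m, omega)
rhs = 1j * omega * z_coord(x0, p0, m, omega)

# Half-form contribution to the eigenvalue, in units of hbar:
# (-i) * (i * omega / 2) = omega / 2.
half_form_shift = ((-1j) * (1j * omega / 2)).real
```

Since z and H are polynomials of degree at most two, the central differences are exact up to rounding, so `lhs` and `rhs` should agree to high precision.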
Pairing maps are designed to allow us to compare the results of quantizing
with respect to two different polarizations. We consider mainly the case 
of two “transverse” real polarizations; the case of two complex polariza- 
tions or one real and one complex polarization can be treated with minor 
modifications. 

Suppose that P and P′ are two purely real polarizations and that the
associated leaf spaces Ξ_1 and Ξ_2 are oriented manifolds. Suppose also that
P and P′ are transverse at each point z ∈ N, meaning that P_z ∩ P′_z =
{0}. If α and β are polarized sections of K_P and K_{P′}, respectively, the
transversality assumption is easily shown to imply that α ∧ β is a nowhere-
vanishing 2n-form on N. Thus, for any point z ∈ N, we can define a bilinear
“pairing” δ_{P,z} × δ_{P′,z} → ℝ by

\[ (\nu_1, \nu_2) = \left( \frac{(\nu_1 \otimes \nu_1) \wedge (\nu_2 \otimes \nu_2)}{\lambda} \right)^{1/2}. \tag{23.42} \]

(Recall Notation 23.49.) We can extend this pairing to a pairing
δ^ℂ_{P,z} × δ^ℂ_{P′,z} → ℂ that is conjugate linear in the first factor and
linear in the second factor. Finally, we extend to a pairing of
(L_z ⊗ δ^ℂ_{P,z}) × (L_z ⊗ δ^ℂ_{P′,z}) → ℂ by setting (μ_1 ⊗ ν_1, μ_2 ⊗ ν_2)
equal to (μ_1, μ_2)(ν_1, ν_2), where (μ_1, μ_2) is computed with respect to
the Hermitian structure on L.

Let H_1 and H_2 denote the half-form Hilbert spaces for P and P′,
respectively. Given s_1 ∈ H_1 and s_2 ∈ H_2, we define the pairing of s_1 and
s_2 by

\[ \langle s_1, s_2 \rangle_{P,P'} = c \int_N (s_1, s_2)\, \lambda, \]

provided that the integral is absolutely convergent. Here (s_1, s_2) is the
pointwise pairing of s_1 and s_2 defined in the previous paragraph and c is
a certain “universal” constant, depending only on ħ and the dimension of
N, that can be chosen to make certain examples work out nicely. We now
look for a pairing map Λ_{P,P′} : H_1 → H_2 with the property that

\[ \langle s_1, s_2 \rangle_{P,P'} = \langle \Lambda_{P,P'} s_1, s_2 \rangle_{H_2}. \tag{23.43} \]

If the pairing is bounded (i.e., it satisfies |⟨s_1, s_2⟩_{P,P′}| ≤ C ‖s_1‖ ‖s_2‖ for
some constant C), there is a unique bounded operator Λ_{P,P′} satisfying
(23.43). Even if the pairing is unbounded, we may be able to define Λ_{P,P′}
as an unbounded operator.

If we were optimistic, we might hope that the pairing map for any two
transverse polarizations would be unitary, or at least a constant multiple
of a unitary map. If this were the case, it would suggest that quantization
is independent of the choice of polarization, in the sense that there would
be a natural unitary map between the Hilbert spaces for two different
polarizations. As it turns out, however, the typical pairing map is not a
constant multiple of a unitary map. Nevertheless, there are certain special
cases where the pairing map is unitary (up to a constant), including the case
of translation-invariant polarizations on ℝ^{2n}. See also [20] for an example of
a pairing map between a real and a complex polarization that is a constant
multiple of a unitary map.

We compute just one very special case of the pairing map between two 
real polarizations. 


Example 23.54 Consider N = ℝ² ≅ T*ℝ and take L to be trivial with
connection 1-form θ = p dx. Let P be the vertical polarization, spanned at
each point by ∂/∂p, and let P′ be the horizontal polarization, spanned at
each point by ∂/∂x. Then elements s_1 of the half-form space for P have the
form

\[ s_1(x,p) = \phi(x) \otimes \sqrt{dx} \tag{23.44} \]

and elements s_2 of the half-form space for P′ have the form

\[ s_2(x,p) = \psi(p)\, e^{ixp/\hbar} \otimes \sqrt{dp}, \tag{23.45} \]

where φ and ψ are functions on ℝ. If c = 1, the pairing is computed as

\[ \langle s_1, s_2 \rangle_{P,P'} = -\int_{\mathbb{R}^2} \overline{\phi(x)}\, \psi(p)\, e^{ixp/\hbar}\, dx\, dp. \tag{23.46} \]

If s_1 has the form (23.44), then Λ_{P,P′}(s_1) has the form (23.45), where

\[ \psi(p) = -\int_{\mathbb{R}} \phi(x)\, e^{-ixp/\hbar}\, dx. \]

Thus, Λ_{P,P′} is a scaled version of the Fourier transform and is, in
particular, a constant multiple of a unitary map.


The pairing should be defined initially on some dense subspace of the
Hilbert spaces, such as the subspaces where φ and ψ are Schwartz
functions. The pairing map can also be defined initially on the Schwartz
space, recognized as being unitary (up to a constant), and then extended by
continuity to all of H_1. Once the pairing map is extended to H_1, the pairing
itself can be defined for all s_1 ∈ H_1 and s_2 ∈ H_2 by taking (23.43) as the
definition of ⟨s_1, s_2⟩_{P,P′}. Even though it is possible, as just described, to
extend the pairing to all of H_1 × H_2, the integral in (23.46) is not always
absolutely convergent.
Proof. The forms (23.44) and (23.45) are obtained by a simple modification
of the argument in the proof of Proposition 22.8. We can compute that the
pointwise pairing of √dx and √dp is −1, which gives the indicated form of
the pairing in (23.46). The pairing may be rewritten as

\[ \int_{\mathbb{R}} \overline{\left[ -\int_{\mathbb{R}} \phi(x)\, e^{-ixp/\hbar}\, dx \right]}\, \psi(p)\, dp, \]

which gives the indicated form of the pairing map. ∎
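
The claim that the pairing map of Example 23.54 is a constant multiple of a unitary map can be illustrated numerically. The sketch below is not part of the original argument: it takes the Gaussian φ(x) = e^{−x²/2}, the value ħ = 0.5, and a midpoint-rule discretization (all of these are arbitrary illustrative choices) and compares the squared L² norm of p ↦ ∫ φ(x) e^{−ixp/ħ} dx with that of φ. The overall sign in the pairing map does not affect the norms. For this Gaussian the ratio can be computed by hand and equals 2πħ.

```python
import cmath
import math

def pairing_map_value(phi, p, hbar, xs, dx):
    """Midpoint-rule approximation of the integral of phi(x) e^{-ixp/hbar} dx."""
    return sum(phi(x) * cmath.exp(-1j * x * p / hbar) for x in xs) * dx

def l2_norm_squared(values, step):
    """Midpoint-rule approximation of the squared L^2 norm."""
    return sum(abs(v) ** 2 for v in values) * step

hbar = 0.5                                  # illustrative value
phi = lambda x: math.exp(-x * x / 2)        # sample Schwartz function

n = 400
x_half_width, p_half_width = 8.0, 8.0 * hbar
dx, dp = 2 * x_half_width / n, 2 * p_half_width / n
xs = [-x_half_width + (k + 0.5) * dx for k in range(n)]
ps = [-p_half_width + (k + 0.5) * dp for k in range(n)]

image = [pairing_map_value(phi, p, hbar, xs, dx) for p in ps]
ratio = l2_norm_squared(image, dp) / l2_norm_squared([phi(x) for x in xs], dx)
```

Under these assumptions `ratio` should be close to 2πħ, consistent with the pairing map being √(2πħ) times a unitary map when c = 1.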
1. Let L be a line bundle with connection ∇ over N. Let s be a section of L
and let X_1 and X_2 be two vector fields on N such that X_1(z) = X_2(z)
for some fixed point z ∈ N. Show that

\[ \nabla_{X_1}(s)(z) = \nabla_{X_2}(s)(z). \]

Hint: Use the assumption that ∇_{fX} = f∇_X.


2. Let L be a Hermitian line bundle with Hermitian connection ∇ and
let s_0 be a locally defined section of L such that (s_0, s_0) = 1. Given a
vector field X, let θ(X) be the unique function such that

\[ \nabla_X s_0 = -i\theta(X)\, s_0. \]

Show that θ(X) is real valued.
Hint: Use the Hermitian property of the connection.


3. Consider the definition of the curvature 2-form ω(X, Y) in Definition
23.4.

(a) Show that the expression for ω is C^∞-linear in each of the
variables X, Y, and s. That is to say, show that for all smooth
functions f, we have ω(fX, Y)s = fω(X, Y)s, and similarly for
the variables Y and s.

(b) Show that the value of ω(X, Y)s at a point z depends only on
the values of X, Y, and s at the point z.

(c) Show that the value of ω(X, Y) at a point z does not depend on
the value of s at z, provided that s(z) ≠ 0.


4. Consider the symplectic form ω = dp ∧ dx on ℝ². Define a purely
complex polarization on ℝ² by taking P_z to be the span of the vector ∂/∂z
in (22.9), for some fixed α > 0. Show that P is a Kähler polarization.


5. Let P be the polarization on ℝ² in Exercise 4. Show that the function
κ(x, p) := αp² is a Kähler potential for P.


6. Suppose that ω is a J-invariant 2-form on a complex manifold N. Show
that ω is a (1,1)-form. (Recall the definitions preceding Lemma 23.34.)

Hint: Write ω = ω¹ + ω², where ω¹ is a (1,1)-form and ω² is a sum of
a (2,0)-form and a (0,2)-form. Show that

\[ \omega^2(JX, JY) = -\omega^2(X, Y) \]

for all tangent vectors X and Y.
7. Suppose that κ is a smooth, real-valued function on a complex
manifold N. Show that the 2-form i∂∂̄κ is a real-valued 2-form.


8. In Example 23.30, verify that θ is a symplectic potential for ω, and
compute θ(∂/∂z̄), where, with z = x − iy, we have ∂/∂z̄ = (∂/∂x −
i∂/∂y)/2. Then verify that s_0(z) := (1 − |z|²)^{1/ħ} satisfies ∇_{∂/∂z̄} s_0 = 0
and thus constitutes a global trivializing holomorphic section.


9. Consider the situation in Example 23.29. Show that the canonical
bundle for P is trivial, with trivializing section dx. Let δ_P be the
(nontrivial) bundle described in the paragraph preceding Definition 23.42.
Since the tensor product of any real line bundle with itself is trivial,
δ_P ⊗ δ_P is isomorphic to K_P. Let √dx denote a discontinuous section
defined over the set 0 < φ < 2π such that √dx ⊗ √dx = dx. Show that
∇_X(dx) = 0 and ∇_X √dx = 0 for every vector field X lying in P. Now
show that the Bohr–Sommerfeld leaves (in the half-form sense, for this
choice of δ_P) are the sets of the form {x} × S¹, where x/ħ = n + 1/2
for some integer n.


10. Let b be a smooth, real-valued function on ℝ and let c be a real
constant. Show that an operator of the form

\[ \psi \mapsto i\hbar \left( b(x)\psi'(x) + c\, b'(x)\psi(x) \right) \]

is symmetric on C_c^∞(ℝ) ⊂ L²(ℝ) if and only if c = 1/2.


11. Let P be a real polarization and let f be a smooth polarized function
on N, that is, one for which derivatives in the direction of P are
zero. Show that Q(f) acts on the half-form Hilbert space simply as
multiplication by f. (Compare Proposition 23.25 in the case without
half-forms.)

Hint: Show that ℒ_{X_f} α = 0 whenever α is a polarized section of K_P.


12. Using the identities ℒ_{[X,Y]} = [ℒ_X, ℒ_Y] and X_{{f,g}} = [X_f, X_g], verify
the identity (23.36).


13. Prove that if P is a real polarization on N, it is possible to choose a
symplectic potential θ locally in such a way that θ is zero on P.

Hint: Use functions f_j as in the proof of Proposition 23.26.


14. Suppose that P is a purely real polarization on N and θ is a local
symplectic potential that vanishes on P. Suppose also that f is a
real-valued function for which X_f preserves P. Show that the function
−θ(X_f) − f is constant along the leaves of P.

Hint: If X is a vector field lying in P, use (21.6) to show that X(θ(X_f)) =
dθ(X, X_f).


15. Suppose that β is a nowhere vanishing n-form on an oriented manifold
Ξ, that X is a real vector field on Ξ, and that φ and ψ are smooth,
compactly supported functions on Ξ. Verify the following formula for
“integration by parts”:

\[ \int_\Xi (X\phi)\psi\, \beta = -\int_\Xi \phi(X\psi)\, \beta - \int_\Xi \phi\psi\, (\mathrm{div}_\beta X)\, \beta, \]

where div_β X is the function such that ℒ_X β = (div_β X)β.

Hint: If Φ_t is the flow generated by X, then for all sufficiently small
t, Φ_t(x) is defined for all x in the support of φψ and the integral of
(Φ_t)*(φψβ) over Ξ is independent of t.


16. Let the notation be as in Exercise 8. Then the canonical bundle for
P is trivial, with trivializing section dz. Take δ_P to be trivial, with
trivializing section √dz. Show that every polarized section s of L ⊗ δ_P
is of the form

\[ s = F(z)\, s_0(z) \otimes \sqrt{dz}, \]

where F is holomorphic. Show that the norm of such a section is, up
to a constant, the L² norm of F with respect to a measure of the form
(1 − |z|²)^ν, but that the value of ν is not the same as when half-forms
are not included.


17. Let P be a Kähler polarization on N, let z_1, ..., z_n be holomorphic
local coordinates on N, and let A be the matrix given by

\[ A_{jk} = \omega\!\left( \frac{\partial}{\partial \bar{z}_j}, \frac{\partial}{\partial z_k} \right). \]

(a) Show that the matrix iA is positive definite.
(b) Show that ω = Σ_{j,k} A_{jk} dz̄_j ∧ dz_k.
(c) Show that the quantity ω^n/n! may be computed as

\[ \det(iA)\, (-1)^{n(n-1)/2} (-i)^n \, d\bar{z}_1 \wedge \cdots \wedge d\bar{z}_n \wedge dz_1 \wedge \cdots \wedge dz_n. \]

(d) Verify Proposition 23.50.


18. Let P be a Kähler polarization on N, let δ_P be a fixed square root of
K_P, and let f be a smooth, real-valued function such that X_f preserves
P. Throughout this problem, if s_1 and s_2 are local sections of a line
bundle, with s_2 nonvanishing, s_1/s_2 will denote the unique function
such that s_1 = (s_1/s_2)s_2.

(a) Show that for any smooth, compactly supported function ψ
on N, we have

\[ \int_N X_f(\psi)\, \lambda = 0. \]

Hint: Use Liouville’s theorem.
Note: The same result holds if ψ is not compactly supported but
is “sufficiently nice.”

(b) If ν is a local nonvanishing section of δ_P, show that

\[ \frac{\mathcal{L}_{X_f}\nu}{\nu} = \frac{1}{2}\, \frac{\mathcal{L}_{X_f}(\nu \otimes \nu)}{\nu \otimes \nu}. \]

(c) If α is any 2n-form on N, show that

\[ X_f\!\left( \frac{\alpha}{\lambda} \right) = \frac{\mathcal{L}_{X_f}(\alpha)}{\lambda}. \]

(d) Suppose s_1 and s_2 are polarized sections of L ⊗ δ_P, decomposed
locally as s_j = μ_j ⊗ ν_j, j = 1, 2. Show that

\[ iX_f(s_1, s_2) = (i(\nabla_{X_f}\mu_1) \otimes \nu_1, s_2) + (i\mu_1 \otimes (\mathcal{L}_{X_f}\nu_1), s_2) + (s_1, i(\nabla_{X_f}\mu_2) \otimes \nu_2) + (s_1, i\mu_2 \otimes (\mathcal{L}_{X_f}\nu_2)), \]

where (·,·) is computed with respect to the Hermitian structure
on L ⊗ δ_P described in Sect. 23.7.

Hint: Use the identity ℒ_{X_f}(α ∧ β) = (ℒ_{X_f} α) ∧ β + α ∧ (ℒ_{X_f} β).

(e) Suppose s_1 and s_2 are polarized sections of L ⊗ δ_P belonging to
the domain of Q(f) and such that (s_1, s_2) is “sufficiently nice.”
Show that

\[ \langle s_1, Q(f)s_2 \rangle = \langle Q(f)s_1, s_2 \rangle. \]


Appendix A 


Review of Basic Material 


A.1 Tensor Products of Vector Spaces 


Given two vector spaces V_1 and V_2 over ℂ, the tensor product is a new vector
space V_1 ⊗ V_2, together with a bilinear “product” map ⊗ : V_1 × V_2 → V_1 ⊗ V_2.
If V_1 and V_2 are finite dimensional with bases {u_j} and {v_k}, then V_1 ⊗ V_2
is finite dimensional with {u_j ⊗ v_k} forming a basis for V_1 ⊗ V_2. In the
finite-dimensional case, we could simply define the tensor product by this basis
property, but then we would have to worry about whether the construction
is basis independent. Instead, we define V_1 ⊗ V_2 by a “universal property.”


Definition A.1 Suppose V_1 and V_2 are vector spaces over a field F. Then
a tensor product of V_1 and V_2 is a vector space W over F together with
a bilinear map T : V_1 × V_2 → W having the following “universal property”:
If U is any vector space over F and Φ : V_1 × V_2 → U is a bilinear map,
then there exists a unique linear map Φ̃ : W → U such that Φ̃ ∘ T = Φ; that
is, the triangle formed by T, Φ, and Φ̃ commutes.


Proposition A.2 For any two vector spaces V_1 and V_2, a tensor product
of V_1 and V_2 exists and is unique up to “canonical isomorphism.” That is,
for two tensor products (W_1, T_1) and (W_2, T_2), there is a unique invertible
linear map Ψ : W_1 → W_2 such that T_2 = Ψ ∘ T_1.

In light of the uniqueness result, we may speak of “the” tensor product of
V_1 and V_2. We choose any one tensor product and we denote it by V_1 ⊗ V_2.
We also denote the bilinear map T : V_1 × V_2 → V_1 ⊗ V_2 as (u, v) ↦ u ⊗ v. In
this notation, the universal property reads as follows: Given any bilinear
map Φ of V_1 × V_2 into a vector space U, there exists a unique linear map
Φ̃ : V_1 ⊗ V_2 → U such that

\[ \tilde{\Phi}(u \otimes v) = \Phi(u, v). \]


Proposition A.3 If V_1 and V_2 are finite-dimensional vector spaces with
bases {u_j}_{j=1}^{n_1} and {v_k}_{k=1}^{n_2}, then V_1 ⊗ V_2 is finite dimensional and the set
of elements of the form u_j ⊗ v_k, 1 ≤ j ≤ n_1, 1 ≤ k ≤ n_2, forms a basis for
V_1 ⊗ V_2. In particular,

\[ \dim(V_1 \otimes V_2) = (\dim V_1)(\dim V_2). \]





It should be emphasized that, in general, not every element of V_1 ⊗ V_2
is of the form u ⊗ v with u ∈ V_1 and v ∈ V_2. All we can say is that each
element of V_1 ⊗ V_2 can be decomposed as a linear combination of elements
of the form u ⊗ v. This decomposition, furthermore, is far from canonical;
even in the finite-dimensional case, it depends on a choice of bases for V_1
and V_2. Nevertheless, the universal property of the tensor product tells us
that we can define linear maps from V_1 ⊗ V_2 to any vector space U, simply
by defining them on elements of the form u ⊗ v. Provided that Φ(u, v) is
bilinear in u and v, the universal property tells us that there is a unique
linear map Φ̃ on V_1 ⊗ V_2 such that on elements of the form u ⊗ v, Φ̃ is equal
to Φ(u, v). A representative application of the universal property is in the
following result.


Proposition A.4 If A ∈ End(V_1) and B ∈ End(V_2), there exists a unique
linear map A ⊗ B : V_1 ⊗ V_2 → V_1 ⊗ V_2 such that

\[ (A \otimes B)(u \otimes v) = (Au) \otimes (Bv). \]

For A_1, A_2 ∈ End(V_1) and B_1, B_2 ∈ End(V_2), we have

\[ (A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1 A_2) \otimes (B_1 B_2). \]

To construct A ⊗ B, we apply the universal property with U = V_1 ⊗ V_2
and Φ(u, v) = (Au) ⊗ (Bv). Since A and B are linear and ⊗ is bilinear, Φ
is bilinear. The linear map Φ̃ : V_1 ⊗ V_2 → V_1 ⊗ V_2 is then the map that we
denote A ⊗ B.
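
In the finite-dimensional case, the composition rule in Proposition A.4 can be checked concretely. The sketch below assumes the basis {u_j ⊗ v_k} is ordered lexicographically in (j, k), in which case the matrix of A ⊗ B is the Kronecker product of the matrices of A and B. The helper names `kron` and `matmul` and the sample integer matrices are illustrative choices, not taken from the text.

```python
def kron(a, b):
    """Kronecker product of matrices given as lists of rows.

    Rows are indexed by pairs (i, k) and columns by pairs (j, l), both in
    lexicographic order, with entry a[i][j] * b[k][l].
    """
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Ordinary matrix product of lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Sample operators on V_1 (dimension 2) and V_2 (dimension 3).
A1 = [[1, 2], [3, 4]]
A2 = [[0, 1], [1, 1]]
B1 = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
B2 = [[1, 1, 0], [0, 2, 1], [1, 0, 1]]

lhs = matmul(kron(A1, B1), kron(A2, B2))      # (A1 ⊗ B1)(A2 ⊗ B2)
rhs = kron(matmul(A1, A2), matmul(B1, B2))    # (A1 A2) ⊗ (B1 B2)
```

Because the entries are integers, the two sides agree exactly, and the resulting matrices are 6 × 6, in line with dim(V_1 ⊗ V_2) = (dim V_1)(dim V_2).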

The tensor product, as we have defined it in this section, applies to
all vector spaces, whether finite dimensional or infinite dimensional. The
construction, however, is purely algebraic; if there is a topology on V_1 and
V_2, the tensor product takes no account of that topology. In the Hilbert
space setting, then, we will have to refine the notion of the tensor product
so that the tensor product of two Hilbert spaces will again be a Hilbert
space. See Sect. A.4.5.
It is assumed that the reader is familiar with the basic notions of measure
theory, including the concepts of σ-algebras, measures, measurable
functions, and the Lebesgue integral. A triple (X, Ω, μ), consisting of a set X, a
σ-algebra Ω of subsets of X, and a (non-negative) measure μ on Ω is called
a measure space. A measurable function ψ : X → ℂ is said to be integrable
if ∫_X |ψ| dμ < ∞. The σ-algebra generated by any collection of subsets of a
set X is the smallest σ-algebra of subsets of X containing that collection.

We assume those parts of measure theory that are entirely standard: the
monotone convergence and dominated convergence theorems, L^p spaces,
and Fubini’s theorem. We briefly review a few other topics that might not
be as familiar.

A measure μ on a measurable space (X, Ω) is said to be σ-finite if X can
be written as a countable union of measurable sets of finite measure.


Definition A.5 Suppose μ and ν are two σ-finite measures on a measurable
space (X, Ω). Then we say that μ is absolutely continuous with respect
to ν if for all E ∈ Ω, if ν(E) = 0 then μ(E) = 0. We say that μ and ν
are equivalent if each measure is absolutely continuous with respect to the
other.


Theorem A.6 (Radon–Nikodym) Suppose μ and ν are two σ-finite
measures on a measurable space (X, Ω) and that μ is absolutely continuous
with respect to ν. Then there exists a non-negative, measurable function ρ
on X such that

\[ \mu(E) = \int_E \rho \, d\nu \]

for all E ∈ Ω. The function ρ is called the density of μ with respect to ν.


Definition A.7 A collection M of subsets of a set X is called a monotone
class if M is closed under countable increasing unions and countable
decreasing intersections.


A countable increasing union means the union of a sequence E_j of sets
where E_j is contained in E_{j+1} for each j, with a similar definition for
countable decreasing intersections.


Theorem A.8 (Monotone Class Lemma) Suppose M is a monotone
class of subsets of a set X and suppose M contains an algebra A of subsets
of X. Then M contains the σ-algebra generated by A.


Corollary A.9 Suppose μ and ν are two finite measures on a measurable
space (X, Ω). Suppose μ and ν agree on an algebra A ⊂ Ω. Then μ and ν
agree on the σ-algebra generated by A.


Note that in general, the collection of sets on which two measures agree
is not a σ-algebra, nor even an algebra.
Theorem A.10 Suppose μ is a measure on the Borel σ-algebra in a locally
compact, separable metric space X. Suppose also that μ(K) < ∞ for each
compact subset K of X. Then the space of continuous functions of compact
support on X is dense in L^p(X, μ), for all p with 1 ≤ p < ∞.


A word of clarification is in order here. If ψ is a continuous function on
X with compact support, then ∫_X |ψ|^p dμ is finite, since ψ is bounded and
μ is finite on compact sets. Thus, we can define a map from C_c(X) into
L^p(X, μ) by mapping a continuous function ψ of compact support to the
equivalence class [ψ]. The theorem is asserting, more precisely, that the
image of C_c(X) under this map is dense in L^p(X, μ). It should be noted,
however, that the map ψ ↦ [ψ] need not be injective. After all, if there
is a nonempty open set U inside X with μ(U) = 0, then for any ψ with
support contained in U, the equivalence class [ψ] will be the zero element of
L^p(X, μ). Nevertheless, we will allow ourselves a small abuse of terminology
and say that C_c(X) is dense in L^p(X, μ).


A.3 Elementary Functional Analysis 


In this section, we briefly review some of the results from elementary
functional analysis that we make use of in the text. Most of these results can
be found in the book of Rudin [32].


A.3.1 The Stone–Weierstrass Theorem


The Weierstrass theorem states that every continuous, real-valued function
on an interval can be uniformly approximated by polynomials. A substantial
generalization of this was obtained by Stone. If X is a compact metric
space, let C(X; ℝ) and C(X; ℂ) denote the spaces of continuous real- and
complex-valued functions on X, respectively. A subset A of C(X; F)
is called an algebra if it is closed under pointwise addition, pointwise
multiplication, and multiplication by elements of F, where F = ℝ or ℂ. An
algebra A is said to separate points if for any two distinct points x and y
in X, there exists f ∈ A such that f(x) ≠ f(y). We use on C(X; F) the
supremum norm, given by

\[ \|f\|_{\sup} := \sup_{x \in X} |f(x)| \]

and C(X; F) is complete with respect to the associated distance function,

\[ d(f, g) = \|f - g\|_{\sup}. \]


Theorem A.11 (Stone—Weierstrass, Real Version) Let X be a com- 
pact metric space and let A be an algebra in C(X;R). If A contains the 
constant functions and separates points, then A is dense in C(X;R) with 
respect to the supremum norm. 
Theorem A.12 (Stone–Weierstrass, Complex Version) Let X be a
compact metric space and let A be an algebra in C(X; ℂ). If A contains the
constant functions, separates points, and is closed under complex
conjugation, then A is dense in C(X; ℂ) with respect to the supremum norm.


A consequence of the complex version of the Stone–Weierstrass theorem
is the following: If K is a compact subset of ℂ, then every continuous,
complex-valued function on K can be uniformly approximated by
polynomials in z and z̄.


A.3.2 The Fourier Transform 


We now describe the Fourier transform on ℝ^n, in various forms.

Definition A.13 For any ψ ∈ L¹(ℝ^n), define the Fourier transform of
ψ to be the function ψ̂ on ℝ^n given by

\[ \hat{\psi}(k) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-ik\cdot x}\, \psi(x)\, dx. \]


Proposition A.14 For any ψ ∈ L¹(ℝ^n), the Fourier transform ψ̂ of ψ has
the following properties: (1) |ψ̂(k)| ≤ (2π)^{−n/2} ‖ψ‖_{L¹}, (2) ψ̂ is continuous,
and (3) ψ̂(k) tends to zero as |k| tends to ∞.


The bound on ψ̂ is obvious and the continuity of ψ̂ follows from
dominated convergence. To show that ψ̂ tends to zero at infinity, we first
establish this on a dense subspace of L¹(ℝ^n) (e.g., the Schwartz space; see
below) and then take uniform limits.


Definition A.15 The Schwartz space S(ℝ^n) is the space of all C^∞
functions ψ on ℝ^n such that

\[ \lim_{x \to \pm\infty} |x^j \partial^k \psi(x)| = 0 \]

for all n-tuples of non-negative integers j and k. Here if j = (j_1, ..., j_n)
then x^j = x_1^{j_1} ⋯ x_n^{j_n} and

\[ \partial^j = \left( \frac{\partial}{\partial x_1} \right)^{j_1} \cdots \left( \frac{\partial}{\partial x_n} \right)^{j_n}. \]

An element of the Schwartz space is called a Schwartz function.

Proposition A.16 If ψ belongs to S(ℝ^n), then ψ̂ also belongs to S(ℝ^n).


The proof of this result hinges on the behavior of the Fourier transform
under differentiation and under multiplication by x, results which are of
interest in their own right.
Proposition A.17 If ψ is a Schwartz function, the following properties
hold.

1. We have

\[ \widehat{\frac{\partial \psi}{\partial x_j}}(k) = i k_j \hat{\psi}(k). \tag{A.1} \]

2. The function ψ̂ is differentiable at every point and the Fourier
transform of the function x_j ψ(x) is given by

\[ \widehat{(x_j \psi)}(k) = i \frac{\partial \hat{\psi}}{\partial k_j}(k). \tag{A.2} \]


The first point is proved by integration by parts and the second by
differentiation under the integral in the definition of ψ̂.


Theorem A.18 (Fourier Inversion and Plancherel Formula, I) The
Fourier transform on S(ℝ^n) has the following properties.

1. The Fourier transform maps the Schwartz space onto the Schwartz
space.

2. For all ψ ∈ S(ℝ^n), the function ψ can be recovered from its Fourier
transform by the Fourier inversion formula:

\[ \psi(x) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{ik\cdot x}\, \hat{\psi}(k)\, dk. \]

3. For all ψ ∈ S(ℝ^n), we have the Plancherel theorem:

\[ \int_{\mathbb{R}^n} |\psi(x)|^2\, dx = \int_{\mathbb{R}^n} |\hat{\psi}(k)|^2\, dk. \]


Since the Schwartz space is dense in L²(ℝ^n), the BLT theorem and
Theorem A.18 imply that the Fourier transform extends uniquely to an isometric
map of L²(ℝ^n) onto L²(ℝ^n).


Theorem A.19 (Fourier Inversion and Plancherel Theorem, II)
The Fourier transform extends to an isometric map F of L²(ℝ^n) onto
L²(ℝ^n). This map may be computed as

\[ F(\psi)(k) = (2\pi)^{-n/2} \lim_{A \to \infty} \int_{|x| \le A} e^{-ik\cdot x}\, \psi(x)\, dx, \tag{A.3} \]

where the limit is in the norm topology of L²(ℝ^n). The inverse map F^{−1}
may be computed as

\[ (F^{-1}f)(x) = (2\pi)^{-n/2} \lim_{A \to \infty} \int_{|k| \le A} e^{ik\cdot x}\, f(k)\, dk. \]

If ψ belongs to L¹(ℝ^n) ∩ L²(ℝ^n), then by dominated convergence, the
limit in (A.3) coincides with the L¹ Fourier transform in Definition A.13.


Definition A.20 For two measurable functions φ and ψ, define the
convolution φ ∗ ψ of φ and ψ by the formula

\[ (\phi * \psi)(x) = \int_{\mathbb{R}^n} \phi(x - y)\, \psi(y)\, dy, \]

provided that the integral is absolutely convergent for all x.


Proposition A.21 Suppose that φ and ψ belong to L¹(ℝ^n) ∩ L²(ℝ^n). Then
φ ∗ ψ is defined and belongs to L¹(ℝ^n) ∩ L²(ℝ^n) and we have

\[ (2\pi)^{-n/2} F(\phi * \psi) = F(\phi) F(\psi). \]

This result is proved by plugging φ ∗ ψ into the definition of the Fourier
transform, writing e^{−ik·x} as e^{−ik·y} e^{−ik·(x−y)}, and using Fubini’s theorem.

We will have occasion to use the following Gaussian integral.


Proposition A.22 For all a > 0 and b ∈ ℂ, we have

\[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/(2a)}\, e^{bx}\, dx = \sqrt{a}\, e^{ab^2/2}. \]

Taking b = ik in the proposition gives us the Fourier
transform of the Gaussian function e^{−x²/(2a)}. Taking b = 0 allows us to
determine the proper normalization of the Gaussian probability density.
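
The Gaussian integral in Proposition A.22 is easy to test numerically for both real and purely imaginary b. The following is a minimal sketch using a midpoint rule on a truncated interval. The function names, the truncation width, the number of sample points, and the sample values of a and b are all illustrative assumptions.

```python
import cmath
import math

def gaussian_integral(a, b, half_width=30.0, n=60000):
    """Midpoint-rule value of (2 pi)^(-1/2) * integral of e^{-x^2/(2a)} e^{bx} dx."""
    dx = 2 * half_width / n
    total = 0j
    for k in range(n):
        x = -half_width + (k + 0.5) * dx
        total += cmath.exp(-x * x / (2 * a) + b * x)
    return total * dx / math.sqrt(2 * math.pi)

def closed_form(a, b):
    """Right-hand side of Proposition A.22: sqrt(a) * e^{a b^2 / 2}."""
    return math.sqrt(a) * cmath.exp(a * b * b / 2)

a = 2.0
real_case = (gaussian_integral(a, 0.5), closed_form(a, 0.5))
# b = i k with k = 1.5: the right-hand side becomes sqrt(a) e^{-a k^2 / 2},
# a Gaussian in k, as expected for the Fourier transform of a Gaussian.
imaginary_case = (gaussian_integral(a, 1.5j), closed_form(a, 1.5j))
```

In each pair the numerical value and the closed form should agree to many digits, since the integrand is smooth and decays rapidly.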


A.3.3 Distributions


In this section we give a brief account of the theory of distributions (what
physicists call “generalized functions”), including the notion of “derivative
in the distribution sense.”

The idea is that we study functions by studying their integral against
some class of very nice “test functions.” Consider, for example, a locally
integrable function f and consider integrals of the form

\[ \int_{\mathbb{R}^n} \chi(x) f(x)\, dx \tag{A.4} \]

where χ belongs to C_c^∞(ℝ^n), the space of smooth, compactly supported
functions. We might think, for example, that χ is positive, has integral
equal to 1, and is supported near some point a ∈ ℝ^n. In that case, the
integral (A.4) is an approximation to the value of f at a, what physicists
describe as a “smeared out” version of f(a).
Proposition A.23 Suppose f_1 and f_2 are locally integrable functions on
ℝ^n. If

\[ \int_{\mathbb{R}^n} \chi(x) f_1(x)\, dx = \int_{\mathbb{R}^n} \chi(x) f_2(x)\, dx \]

for all χ ∈ C_c^∞(ℝ^n), then f_1(x) = f_2(x) for almost every x.


The idea now is that we allow objects that do not have values at points,
but for which something like (A.4) makes sense. Mathematically, we think
of (A.4) as a linear functional on C_c^∞(ℝ^n).


Definition A.24 A sequence χ_m ∈ C_c^∞(ℝ^n) is said to converge to χ ∈
C_c^∞(ℝ^n) if (1) there exists a single compact set K containing the support
of all the χ_m’s, (2) χ_m converges uniformly to χ, and (3) each derivative
of χ_m converges uniformly to the corresponding derivative of χ.


Definition A.25 A distribution on ℝ^n is a linear map T : C_c^∞(ℝ^n) → ℂ
having the following continuity property: If χ_m converges to χ in the sense
of Definition A.24, T(χ_m) converges to T(χ).


The continuity condition on T should be regarded as a technicality, in
that any functional that is well defined and linear on all of C_c^∞(ℝ^n) and is
obtained in a reasonably constructive fashion will satisfy this property.


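Example A.26 The Dirac $\delta$-“function” is the distribution $\delta$ defined by

$$\delta(\chi) = \chi(0).$$


Definition A.27 If $T$ is a distribution and $f$ is a locally integrable function,
the expression “$T$ is equal to $f$” or “$T$ is given by $f$” means that

$$T(\chi) = \int_{\mathbb{R}^n} \chi(x) f(x)\,dx$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.


Definition A.28 If $T$ is a distribution, define the distribution $\partial T/\partial x_j$ by
the formula

$$\frac{\partial T}{\partial x_j}(\chi) = -T\left(\frac{\partial \chi}{\partial x_j}\right).$$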


It is easy to verify that if $T$ has the continuity property in Definition
A.25, then so does $\partial T/\partial x_j$. Furthermore, if $T$ is given by a continuously
differentiable function, then the derivative of $T$ in the distribution sense
coincides with the derivative of $T$ in the classical sense, as can easily be
shown using integration by parts. If $T$ is a distribution, we may define $\Delta T$
by repeated applications of Definition A.28, with the result that

$$(\Delta T)(\chi) = T(\Delta\chi).$$
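
As a worked illustration of Definition A.28 (an example that does not appear in the original text), consider the step function $\Theta$ on $\mathbb{R}$, equal to 1 for $x \geq 0$ and 0 for $x < 0$, and let $T_\Theta$ denote the distribution it defines; the symbols $\Theta$ and $T_\Theta$ are introduced here only for this sketch. Since each test function has compact support, the boundary term at infinity vanishes:

```latex
\begin{align*}
T_\Theta(\chi) &= \int_{-\infty}^{\infty} \chi(x)\,\Theta(x)\,dx
               = \int_0^\infty \chi(x)\,dx, \\
\frac{dT_\Theta}{dx}(\chi) &= -T_\Theta\!\left(\frac{d\chi}{dx}\right)
               = -\int_0^\infty \chi'(x)\,dx
               = \chi(0) = \delta(\chi).
\end{align*}
```

Thus the derivative of $\Theta$ in the distribution sense is the Dirac $\delta$, even though $\Theta$ has no classical derivative at the origin.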
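

Proposition A.29 If $\phi$ and $\psi$ are $L^2$ functions, the equation $\partial\psi/\partial x_j = \phi$
holds in the distribution sense if and only if

$$-\left\langle \frac{\partial \chi}{\partial x_j}, \psi \right\rangle = \langle \chi, \phi \rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$. Similarly, the equation $\Delta\psi = \phi$ holds in the
distribution sense if and only if

$$\langle \Delta\chi, \psi \rangle = \langle \chi, \phi \rangle$$

for all $\chi \in C_c^\infty(\mathbb{R}^n)$.
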
Proposition A.30 If $T$ is a distribution on $\mathbb{R}$ and $dT/dx$ is the zero
distribution, then $T$ is a constant, meaning that there is some constant $c$ such
that

$$T(\chi) = \int_{-\infty}^{\infty} \chi(x)\,c\,dx. \tag{A.5}$$


Suppose, in particular, that $T$ is given by a locally integrable function $f$,
and the derivative of $T$ is zero. Then Proposition A.30 tells us that for some
constant $c$, we have $\int_{-\infty}^{\infty} \chi(x)(f(x) - c)\,dx = 0$ for all $\chi \in C_c^\infty(\mathbb{R})$. Then
Proposition A.23 tells us that $f(x) = c$ almost everywhere. This means that
if the derivative of $f$ is zero, even in the weak (or distributional) sense, then
$f$ must be constant.


A.3.4 Banach Spaces 


In this section, we define Banach spaces and describe some of their elemen- 
tary properties. 


Definition A.31 A norm on a vector space $V$ over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$) is a
map from $V$ into $\mathbb{R}$, denoted $\psi \mapsto \|\psi\|$, with the following properties.


1. For all $\psi \in V$, $\|\psi\| \geq 0$, with equality if and only if $\psi = 0$.

2. For all $\psi \in V$ and $c \in \mathbb{F}$, we have $\|c\psi\| = |c|\,\|\psi\|$.

3. For all $\phi, \psi \in V$, we have $\|\phi + \psi\| \leq \|\phi\| + \|\psi\|$.


If $\|\cdot\|$ is a norm on $V$, then we can define a distance function $d$ on $V$ by
setting $d(\phi, \psi) = \|\psi - \phi\|$.


Definition A.32 A normed vector space is said to be a Banach space
if it is complete with respect to the associated distance function. A Banach
space is said to be separable if it contains a countable dense subset.


One important class of examples of Banach spaces is the class of $L^p$ spaces.


Definition A.33 An infinite series $\sum_{n=1}^{\infty} \psi_n$, with values in a normed space
$V$, is said to converge if there exists some $L \in V$ such that

$$\lim_{N \to \infty} \|S_N - L\| = 0,$$

where $S_N = \sum_{n=1}^{N} \psi_n$.


Proposition A.34 If $V$ is a Banach space, then absolute convergence
implies convergence in $V$. That is, if

$$\sum_{n=1}^{\infty} \|\psi_n\| < \infty,$$

then $\sum_{n=1}^{\infty} \psi_n$ converges in $V$.


Definition A.35 If $V_1$ and $V_2$ are normed spaces, a linear map $T : V_1 \to
V_2$ is bounded if

$$\sup_{\psi \in V_1 \setminus \{0\}} \frac{\|T\psi\|}{\|\psi\|} < \infty. \tag{A.6}$$

If $T$ is bounded, then the supremum in (A.6) is called the operator norm
of $T$, denoted $\|T\|$.
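
To make the supremum in (A.6) concrete, here is a minimal numerical sketch for a linear map on $\mathbb{R}^2$ with the Euclidean norm. The matrix `T`, the helper names, and the sampling scheme are illustrative assumptions, not part of the original text. Sampling unit vectors only gives a lower estimate of the supremum, but for the diagonal matrix chosen here the operator norm is the largest diagonal entry in absolute value, namely 3.

```python
import math

def apply(T, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def operator_norm_estimate(T, samples=3600):
    """Largest ratio ||T v|| / ||v|| over sampled unit vectors in R^2."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        v = [math.cos(theta), math.sin(theta)]
        best = max(best, norm(apply(T, v)) / norm(v))
    return best

# Hypothetical example: a diagonal map stretching the first axis by 3.
T = [[3.0, 0.0], [0.0, 1.0]]
estimate = operator_norm_estimate(T)
```

In infinite dimensions no such sampling is available in general, which is why boundedness is imposed as a condition in the definition.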


Theorem A.36 (Bounded Linear Transformation Theorem) Let $V_1$
be a normed space and $V_2$ a Banach space. Suppose $W$ is a dense subspace
of $V_1$ and $T : W \to V_2$ is a bounded linear map. Then there exists a unique
bounded linear map $\tilde{T} : V_1 \to V_2$ such that $\tilde{T}|_W = T$. Furthermore, the
norm of $\tilde{T}$ equals the norm of $T$.


Definition A.37 If V is a normed space over F (F = R or C), then a 
bounded linear functional on V is a bounded linear map of V into F, 
where on F we use the norm given by the absolute value. The collection of 
all bounded linear functionals, with the norm given by (A.6), is called the 
dual space to V, denoted V*. 


Theorem A.38 If $V$ is a normed vector space, then the following results
hold.


1. The dual space $V^*$ is a Banach space.

2. For all $\psi \in V$, there exists a nonzero $\xi \in V^*$ such that
$|\xi(\psi)| = \|\xi\|\,\|\psi\|$.
In particular, if $\xi(\psi) = 0$ for all $\xi \in V^*$, then $\psi = 0$.


Theorem A.39 (Closed Graph Theorem) Suppose that $V_1$ is a Banach
space and $V_2$ a normed vector space. For any linear map $T : V_1 \to V_2$, let
Graph($T$) denote the set of pairs $(\psi, T\psi)$ in $V_1 \times V_2$ such that $\psi \in V_1$. If
the graph of $T$ is a closed subset of $V_1 \times V_2$, then $T$ is bounded.


Here is a simple example of how the closed graph theorem can be applied.
Suppose $V_1$ and $V_2$ are Banach spaces and $T : V_1 \to V_2$ is a linear map that
is one-to-one, onto, and bounded. Then the inverse map $T^{-1} : V_2 \to V_1$ is
automatically bounded. To verify this, we first check that if $T$ is bounded,
then the graph of $T$ is closed (easy). Then we observe that the graph of
$T^{-1}$ is also closed, since it is obtained from the graph of $T$ by the map
$(\phi, \psi) \mapsto (\psi, \phi)$. Thus, the theorem tells us that $T^{-1}$ is bounded.


Theorem A.40 (Principle of Uniform Boundedness) Suppose $\{T_\alpha\}$
is any family of bounded linear maps from a Banach space $V_1$ to a normed
space $V_2$. Suppose that for each $\psi \in V_1$, there is a constant $C_\psi$ such that
$\|T_\alpha \psi\| \leq C_\psi$ for all $\alpha$. Then there exists a constant $C$ such that $\|T_\alpha\| \leq C$
for all $\alpha$.


That is, in contrapositive form, if the family of norms $\{\|T_\alpha\|\}$ is unbounded,
then $\{\|T_\alpha \psi\|\}$ must be unbounded for some $\psi \in V_1$.


Corollary A.41 Suppose $V$ is a Banach space and $E$ is a nonempty subset
of $V$. Suppose that for all $\xi \in V^*$ there exists a constant $C_\xi$ such that
$|\xi(\psi)| \leq C_\xi$ for all $\psi \in E$. Then $E$ is a bounded set.


The corollary is obtained by identifying each $\psi \in V$ with the linear map
$e_\psi : V^* \to \mathbb{C}$ given by evaluation on $\psi$; that is, $e_\psi(\xi) = \xi(\psi)$. Note that by
Point 2 of Theorem A.38, the norm of $e_\psi$ as an element of $V^{**}$ is equal to
the norm of $\psi$ as an element of $V$.


A.4.1 Inner Product Spaces and Hilbert Spaces


We now introduce a generalization to arbitrary vector spaces over $\mathbb{R}$ or $\mathbb{C}$
of the usual inner product (or dot product) on $\mathbb{R}^n$.


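Definition A.42 An inner product on a vector space $V$ over $\mathbb{F}$ ($\mathbb{F} = \mathbb{R}$ or
$\mathbb{C}$) is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}$ with the following properties.


1. For all $\phi, \psi \in V$, we have $\langle \psi, \phi \rangle = \overline{\langle \phi, \psi \rangle}$.

2. For all $\phi \in V$, $\langle \phi, \phi \rangle$ is real and non-negative, and $\langle \phi, \phi \rangle = 0$ only if
$\phi = 0$.

3. For all $\phi, \psi \in V$ and $c \in \mathbb{F}$, we have $\langle c\phi, \psi \rangle = \bar{c}\,\langle \phi, \psi \rangle$ and
$\langle \phi, c\psi \rangle = c\,\langle \phi, \psi \rangle$.

4. For all $\phi, \psi, \chi \in V$, we have $\langle \phi + \psi, \chi \rangle = \langle \phi, \chi \rangle + \langle \psi, \chi \rangle$ and
$\langle \phi, \psi + \chi \rangle = \langle \phi, \psi \rangle + \langle \phi, \chi \rangle$.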


Note that we are following the physics convention of taking the complex
conjugate in Point 3 of the definition on the first factor in the inner product.


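Proposition A.43 If $V$ is an inner product space, then for all $\phi, \psi \in V$,
we have the Cauchy–Schwarz inequality:

$$|\langle \phi, \psi \rangle|^2 \leq \langle \phi, \phi \rangle \langle \psi, \psi \rangle.$$

Furthermore, if $\|\cdot\| : V \to \mathbb{R}$ is defined by

$$\|\psi\| = \sqrt{\langle \psi, \psi \rangle}, \tag{A.7}$$

then $\|\cdot\|$ is a norm on $V$.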
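
The convention in Point 3 of Definition A.42 and the inequality in Proposition A.43 can be checked on $\mathbb{C}^3$. This is only a sketch under stated assumptions: the helper names `inner` and `norm` and the sample vectors are hypothetical and chosen purely for illustration.

```python
import math

def inner(phi, psi):
    """Inner product on C^n, conjugate linear in the FIRST factor (physics convention)."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm(psi):
    """Norm induced by the inner product, as in (A.7)."""
    return math.sqrt(inner(psi, psi).real)

# Hypothetical sample vectors in C^3.
phi = [1 + 2j, 0.5j, -1.0 + 0j]
psi = [2 - 1j, 3 + 0j, 1j]

# Cauchy-Schwarz: |<phi, psi>|^2 <= <phi, phi> <psi, psi>.
lhs = abs(inner(phi, psi)) ** 2
rhs = inner(phi, phi).real * inner(psi, psi).real

# Triangle inequality for the induced norm.
total = [a + b for a, b in zip(phi, psi)]
triangle_ok = norm(total) <= norm(phi) + norm(psi)

# Scalars come out of the first slot with a complex conjugate.
c = 2 + 3j
conj_linear_gap = abs(inner([c * a for a in phi], psi) - c.conjugate() * inner(phi, psi))
```

Swapping which factor carries the conjugate gives the convention more common in the mathematics literature; the norm and the Cauchy–Schwarz inequality are unaffected by that choice.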


Definition A.44 A Hilbert space is a vector space H over R or C, 
equipped with an inner product (-,-), such that H is complete in the norm 
given by (A.7). 


That is to say, a Hilbert space is a Banach space in which the norm 
comes from an inner product. In Appendix A.4 only, we allow H to denote 
an arbitrary Hilbert space over R or C. (In the main body of the text, H 
denotes a separable complex Hilbert space.) 


Definition A.45 Suppose $H_j$ is a sequence of separable Hilbert spaces.
Then the Hilbert space direct sum, denoted

$$H = \bigoplus_{j=1}^{\infty} H_j,$$

is the space of sequences $\psi = (\psi_1, \psi_2, \psi_3, \ldots)$ such that $\psi_j \in H_j$ and such
that

$$\|\psi\|^2 := \sum_{j=1}^{\infty} \|\psi_j\|_j^2 < \infty. \tag{A.8}$$

The finite direct sum of the $H_j$'s is the set of $\psi = (\psi_1, \psi_2, \psi_3, \ldots)$ such
that $\psi_j = 0$ for all but finitely many values of $j$.
We define an inner product on the direct sum by setting

$$\langle \phi, \psi \rangle = \sum_{j=1}^{\infty} \langle \phi_j, \psi_j \rangle_j \tag{A.9}$$

for all $\phi, \psi \in H$. This inner product is well defined and $H$ is complete with
respect to this inner product, and hence a Hilbert space.

One important example of a Hilbert space is $L^2(X, \mu)$, where $(X, \mu)$ is a
measure space.


Definition A.46 If $(X, \mu)$ is a measure space, define an inner product on
$L^2(X, \mu)$ by the formula

$$\langle \phi, \psi \rangle = \int_X \overline{\phi(x)}\,\psi(x)\,d\mu(x). \tag{A.10}$$

A standard result in measure theory states that the integral on the right-hand
side of (A.10) is absolutely convergent for all $\phi$ and $\psi$ in $L^2(X, \mu)$.
It is then easy to verify that $\langle \cdot, \cdot \rangle$ is indeed an inner product on $L^2(X, \mu)$.
Another standard result states that $L^2(X, \mu)$ is complete with respect to
the norm associated with the inner product in (A.10); thus, $L^2(X, \mu)$ is a
Hilbert space.


A.4.2 Orthogonality


One reason that Hilbert spaces are nicer to work with than general Banach 
spaces is that we have the concept of orthogonality. 


Definition A.47 Two elements $\phi$ and $\psi$ of an inner product space are
orthogonal if $\langle \phi, \psi \rangle = 0$.


Definition A.48 If $V$ is any subspace of $H$, define a subspace $V^\perp$ of $H$
by

$$V^\perp = \{\phi \in H \mid \langle \phi, \psi \rangle = 0 \text{ for all } \psi \in V\}.$$

Then $V^\perp$ is called the orthogonal space of $V$.

Proposition A.49


1. If $V$ is a closed subspace of $H$, every $\psi \in H$ can be decomposed
uniquely as $\psi = \psi_1 + \psi_2$, with $\psi_1 \in V$ and $\psi_2 \in V^\perp$.

2. If $V$ is any subspace of $H$, then $(V^\perp)^\perp = \overline{V}$, where $\overline{V}$ is the closure
of $V$. In particular, if $V$ is closed, then $(V^\perp)^\perp = V$.


If $V$ is closed, we call $V^\perp$ the orthogonal complement of $V$.


Definition A.50 A set $\{e_j\}$ of elements of $H$, where $j$ ranges over an
arbitrary index set, is said to be orthonormal if

$$\langle e_j, e_k \rangle = \begin{cases} 0, & j \neq k, \\ 1, & j = k. \end{cases}$$

An orthonormal set $\{e_j\}$ is an orthonormal basis for $H$ if the space of
finite linear combinations of the $e_j$'s is dense in $H$.


If $H = L^2([-L, L])$, for some positive number $L$, then the functions

$$e_j(x) = \frac{1}{\sqrt{2L}}\, e^{i\pi j x/L}, \quad j \in \mathbb{Z}, \tag{A.11}$$

form an orthonormal basis for $H$.


Proposition A.51 Suppose $\{e_j\}$ is an orthonormal basis for $H$. Then
every $\psi \in H$ can be expressed uniquely as a convergent sum

$$\psi = \sum_j a_j e_j, \tag{A.12}$$

where the coefficients are given by $a_j = \langle e_j, \psi \rangle$. If $\psi$ is as in (A.12), then

$$\|\psi\|^2 = \sum_j |a_j|^2.$$

Finally, if $(a_j)$ is any sequence such that $\sum_j |a_j|^2 < \infty$, there exists a
unique $\psi \in H$ such that $\langle e_j, \psi \rangle = a_j$ for all $j$.


In the case that the orthonormal basis is the one in (A.11), the resulting
series (A.12) is called the Fourier series of $\psi$.
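
As a worked example of Proposition A.51 (not taken from the original text), take $H = L^2([-L, L])$, assume the basis functions are $e_j(x) = (2L)^{-1/2} e^{i\pi j x/L}$, and let $\psi(x) = x$. Integrating by parts, and using that $e^{-i\pi j x/L}$ integrates to zero over $[-L, L]$ for $j \neq 0$, one finds:

```latex
\begin{align*}
a_0 &= \frac{1}{\sqrt{2L}} \int_{-L}^{L} x \,dx = 0, \\
a_j &= \frac{1}{\sqrt{2L}} \int_{-L}^{L} e^{-i\pi j x/L}\, x \,dx
     = \frac{1}{\sqrt{2L}} \cdot \frac{2 i L^2 (-1)^j}{\pi j},
     \qquad j \neq 0, \\
|a_j|^2 &= \frac{2L^3}{\pi^2 j^2}, \qquad
\|\psi\|^2 = \int_{-L}^{L} x^2 \,dx = \frac{2L^3}{3}.
\end{align*}
```

The identity $\|\psi\|^2 = \sum_j |a_j|^2$ then reads $\frac{2L^3}{3} = \frac{4L^3}{\pi^2} \sum_{j=1}^{\infty} \frac{1}{j^2}$, which is equivalent to the classical evaluation $\sum_{j=1}^{\infty} j^{-2} = \pi^2/6$.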


A.4.3 The Riesz Theorem and Adjoints


We let B(H) denote the space of bounded linear maps of H to H. It is not 
hard to show that B(H) forms a Banach space under the operator norm. 


Theorem A.52 (Riesz Theorem) If $\xi : H \to \mathbb{C}$ is a bounded linear
functional, then there exists a unique $\chi \in H$ such that

$$\xi(\psi) = \langle \chi, \psi \rangle$$

for all $\psi \in H$. Furthermore, the operator norm of $\xi$ as a linear functional
is equal to the norm of $\chi$ as an element of $H$.


We now turn to the concept of the adjoint of a bounded operator, along 
with the related concept of quadratic forms on H. 


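Proposition A.53 For any $A \in B(H)$, there exists a unique linear
operator $A^* : H \to H$, called the adjoint of $A$, such that

$$\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$$

for all $\phi, \psi \in H$. For all $A, B \in B(H)$ and $\alpha, \beta \in \mathbb{C}$ we have

$$\begin{aligned}
(A^*)^* &= A, \\
(AB)^* &= B^* A^*, \\
(\alpha A + \beta B)^* &= \bar{\alpha} A^* + \bar{\beta} B^*, \\
I^* &= I.
\end{aligned}$$

The operator $A^*$ is bounded and $\|A^*\| = \|A\|$.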


Since $A$ is a bounded operator, the map $\psi \mapsto \langle \phi, A\psi \rangle$ is a bounded linear
functional for each fixed $\phi \in H$. The Riesz theorem then tells us that there
is a unique $\chi \in H$ such that $\langle \phi, A\psi \rangle = \langle \chi, \psi \rangle$. The operator $A^*$ is defined
by setting $A^*\phi = \chi$. It is not hard to check that this definition makes $A^*$
into a bounded linear operator.
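
In the finite-dimensional case $H = \mathbb{C}^n$, the adjoint defined above is the conjugate transpose of the matrix of $A$. The following minimal sketch checks the defining relation $\langle \phi, A\psi \rangle = \langle A^*\phi, \psi \rangle$ numerically; the helper names and the sample matrix and vectors are hypothetical, chosen only for illustration.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(A, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose: entry (j, i) is the conjugate of entry (i, j) of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Hypothetical operator on C^2 that is not self-adjoint.
A = [[1 + 1j, 2 + 0j], [0j, 3 - 2j]]
phi = [1 - 1j, 2j]
psi = [0.5 + 0j, -1 + 1j]

left = inner(phi, apply(A, psi))             # <phi, A psi>
right = inner(apply(adjoint(A), phi), psi)   # <A* phi, psi>
```

Because the conjugate sits on the first factor of the inner product, moving $A$ from the second slot to the first requires both transposing and conjugating its entries.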


Definition A.54 An operator A € B(H) is said to be self-adjoint if 
A* = A and skew-self-adjoint if A* = —A. 


Definition A.55 An operator $U$ on $H$ is unitary if $U$ is surjective and
preserves inner products, that is, $\langle U\phi, U\psi \rangle = \langle \phi, \psi \rangle$ for all $\phi, \psi \in H$.


If $U$ is unitary, then $U$ preserves norms ($\|U\psi\| = \|\psi\|$ for all $\psi \in H$);
therefore, $U$ is bounded with $\|U\| = 1$. By the polarization identity
(Proposition A.59), if $U$ preserves norms, then it also preserves inner products.


Proposition A.56 A bounded operator $U$ is unitary if and only if $U^* =
U^{-1}$, that is, if and only if $UU^* = U^*U = I$.


Proposition A.57 For any closed subspace $V \subset H$, there is a unique
bounded operator $P$ such that $P = I$ on $V$ and $P = 0$ on the orthogonal
complement $V^\perp$. This operator is called the orthogonal projection onto
$V$ and it satisfies $P^2 = P$ and $P^* = P$.

Conversely, if $P$ is any bounded operator on $H$ satisfying $P^2 = P$ and
$P^* = P$, then $P$ is the orthogonal projection onto a closed subspace $V$, where
$V = \operatorname{range}(P)$.


A.4.4 Quadratic Forms 


In this section, we develop the theory of quadratic forms on Hilbert spaces. 
Since this is customarily done only for the inner product itself, we include 
the proofs of the results. 


Definition A.58 A sesquilinear form on $H$ is a map $L : H \times H \to \mathbb{C}$
that is conjugate linear in the first factor and linear in the second factor.
A sesquilinear form is bounded if there exists a constant $C$ such that

$$|L(\phi, \psi)| \leq C \|\phi\| \|\psi\|$$

for all $\phi, \psi \in H$.


Proposition A.59 If $L$ is a sesquilinear form on $H$, $L$ can be recovered
from its values on the diagonal (i.e., the value of $L(\psi, \psi)$ for various $\psi$'s)
as follows:

$$\begin{aligned}
L(\phi, \psi) = {} & \frac{1}{2}\left[L(\phi + \psi, \phi + \psi) - L(\phi, \phi) - L(\psi, \psi)\right] \\
& - \frac{i}{2}\left[L(\phi + i\psi, \phi + i\psi) - L(\phi, \phi) - L(i\psi, i\psi)\right].
\end{aligned} \tag{A.13}$$

This formula is known as the polarization identity.


Note that we do not assume any relationship between $L(\phi, \psi)$ and $L(\psi, \phi)$.
Proof. Direct calculation.
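
For readers who want the direct calculation spelled out, one way to organize it (a sketch, using only the conjugate linearity of $L$ in the first factor and linearity in the second) is as follows:

```latex
\begin{align*}
L(\phi + \psi, \phi + \psi) - L(\phi, \phi) - L(\psi, \psi)
  &= L(\phi, \psi) + L(\psi, \phi), \\
L(\phi + i\psi, \phi + i\psi) - L(\phi, \phi) - L(i\psi, i\psi)
  &= L(\phi, i\psi) + L(i\psi, \phi)
   = i\,L(\phi, \psi) - i\,L(\psi, \phi), \\
\tfrac{1}{2}\bigl[L(\phi, \psi) + L(\psi, \phi)\bigr]
  - \tfrac{i}{2}\bigl[i\,L(\phi, \psi) - i\,L(\psi, \phi)\bigr]
  &= L(\phi, \psi).
\end{align*}
```

The cross term $L(\psi, \phi)$ cancels between the two brackets, which is why no symmetry assumption on $L$ is needed.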


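Definition A.60 A quadratic form on a Hilbert space $H$ is a map $Q :
H \to \mathbb{C}$ with the following properties: (1) $Q(\lambda\psi) = |\lambda|^2 Q(\psi)$ for all $\psi \in H$
and $\lambda \in \mathbb{C}$, and (2) the map $L : H \times H \to \mathbb{C}$ defined by

$$\begin{aligned}
L(\phi, \psi) = {} & \frac{1}{2}\left[Q(\phi + \psi) - Q(\phi) - Q(\psi)\right] \\
& - \frac{i}{2}\left[Q(\phi + i\psi) - Q(\phi) - Q(i\psi)\right]
\end{aligned}$$

is a sesquilinear form. A quadratic form $Q$ is bounded if there exists a
constant $C$ such that

$$|Q(\phi)| \leq C \|\phi\|^2$$

for all $\phi \in H$. The smallest such constant $C$ is the norm of $Q$.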


Proposition A.61 If $Q$ is a quadratic form on $H$ and $L$ is the associated
sesquilinear form, we have the following results.


1. For all $\psi \in H$, we have $Q(\psi) = L(\psi, \psi)$.

2. If $Q$ is bounded, then $L$ is bounded.

3. If $Q(\psi)$ belongs to $\mathbb{R}$ for all $\psi \in H$, then $L$ is conjugate symmetric,
that is,

$$L(\phi, \psi) = \overline{L(\psi, \phi)}$$

for all $\phi, \psi \in H$.


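Proof. Point 1 of the proposition is verified by taking $\phi = \psi$ in the
expression for $L(\phi, \psi)$ and then using the relation $Q(\lambda\psi) = |\lambda|^2 Q(\psi)$. For Point
2, suppose $|Q(\psi)| \leq C \|\psi\|^2$ for all $\psi \in H$. If $\|\phi\| = \|\psi\| = 1$, then $\phi + \psi$
and $\phi + i\psi$ have norm at most 2, and so

$$|L(\phi, \psi)| \leq \frac{1}{2} C (4 + 1 + 1 + 4 + 1 + 1) = 6C.$$

Now, for any $\phi$ and $\psi$ in $H$, we can find unit vectors $\tilde{\phi}$ and $\tilde{\psi}$ such that
$\phi = \|\phi\| \tilde{\phi}$ and $\psi = \|\psi\| \tilde{\psi}$. Then since $L$ is assumed to be sesquilinear, we
have

$$|L(\phi, \psi)| = \|\phi\| \|\psi\| \, |L(\tilde{\phi}, \tilde{\psi})| \leq 6C \|\phi\| \|\psi\|,$$

showing that $L$ is bounded.
For Point 3, assume that $Q(\psi)$ is real for all $\psi \in H$ and define a map
$M : H \times H \to \mathbb{R}$ by

$$M(\phi, \psi) = \frac{1}{2}\left[Q(\phi + \psi) - Q(\phi) - Q(\psi)\right] = \operatorname{Re}[L(\phi, \psi)].$$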
Then $M$ is real-bilinear (because it is the real part of $L$) and symmetric
(because of the expression for $M$ in terms of $Q$). Furthermore, $M(i\phi, i\psi) =
M(\phi, \psi)$. These properties of $M$ show that $M(\phi, i\psi) = -M(\psi, i\phi)$, and so

$$\begin{aligned}
L(\phi, \psi) &= M(\phi, \psi) - iM(\phi, i\psi) \\
&= M(\psi, \phi) + iM(\psi, i\phi) \\
&= \overline{L(\psi, \phi)},
\end{aligned}$$

which is what we wanted to prove. ∎


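Example A.62 If $A$ is a bounded operator on $H$, one can construct a
bounded quadratic form $Q_A$ on $H$ by setting

$$Q_A(\psi) = \langle \psi, A\psi \rangle, \quad \psi \in H.$$

The associated sesquilinear form $L_A$ is then given by

$$L_A(\phi, \psi) = \langle \phi, A\psi \rangle, \quad \phi, \psi \in H.$$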


Proposition A.63 If $Q$ is a bounded quadratic form on $H$, there is a
unique $A \in B(H)$ such that $Q(\psi) = \langle \psi, A\psi \rangle$ for all $\psi \in H$. If $Q(\psi)$ belongs
to $\mathbb{R}$ for all $\psi \in H$, then the operator $A$ is self-adjoint.


Proof. Since $Q$ is bounded, $L$ is also bounded, meaning that there exists
a constant $C$ such that $|L(\phi, \psi)| \leq C \|\phi\| \|\psi\|$ for all $\phi, \psi \in H$. Thus, for
any $\phi \in H$, the linear functional $\psi \mapsto L(\phi, \psi)$ is bounded, with norm at
most $C \|\phi\|$. By the Riesz theorem, then, there exists a unique $\chi \in H$,
with $\|\chi\| \leq C \|\phi\|$, such that $L(\phi, \psi) = \langle \chi, \psi \rangle$. We now define a map
$B : H \to H$ by defining $B\phi = \chi$. Direct calculation shows that $B$ is linear,
and the inequality $\|\chi\| \leq C \|\phi\|$ shows that $B$ is bounded. Setting $A = B^*$
gives $\langle \phi, A\psi \rangle = \langle B\phi, \psi \rangle = L(\phi, \psi)$, and taking $\phi = \psi$
establishes the existence of the desired operator. Uniqueness of $A$ follows
from the observation that if $\langle \phi, A\psi \rangle = 0$ for all $\phi, \psi \in H$, then $A$ is the
zero operator.

If $Q(\psi)$ is real for all $\psi \in H$, then by Point 3 of Proposition A.61, $L$ is
conjugate symmetric. Thus,

$$\langle \phi, A\psi \rangle = L(\phi, \psi) = \overline{L(\psi, \phi)} = \overline{\langle \psi, A\phi \rangle} = \langle A\phi, \psi \rangle$$

for all $\phi, \psi \in H$, showing that $A$ is self-adjoint. ∎


A.4.5 Tensor Products of Hilbert Spaces 


Recall from Appendix A.1 the concept of the tensor product of two vector 
spaces. 


Proposition A.64 Suppose $V_1$ and $V_2$ are inner product spaces, with inner
products $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$. Then there exists a unique inner product $\langle \cdot, \cdot \rangle$ on
$V_1 \otimes V_2$ such that

$$\langle u_1 \otimes v_1, u_2 \otimes v_2 \rangle = \langle u_1, u_2 \rangle_1 \langle v_1, v_2 \rangle_2$$

for all $u_1, u_2 \in V_1$ and $v_1, v_2 \in V_2$.


If $H_1$ and $H_2$ are Hilbert spaces, then we can equip the tensor product
$H_1 \otimes H_2$ with the inner product in Proposition A.64. If $H_1$ and $H_2$ are both
infinite dimensional, however, $H_1 \otimes H_2$ will not be complete with respect
to this inner product. Nevertheless, we can complete $H_1 \otimes H_2$ with respect
to this inner product, thus obtaining a new Hilbert space.


Definition A.65 If $H_1$ and $H_2$ are Hilbert spaces, then the Hilbert tensor
product of $H_1$ and $H_2$, denoted $H_1 \hat{\otimes} H_2$, is the Hilbert space obtained
by completing $H_1 \otimes H_2$ with respect to the inner product in Proposition
A.64.


Proposition A.66 If $H_1$ and $H_2$ are Hilbert spaces with orthonormal
bases $\{e_j\}$ and $\{f_k\}$, respectively, then $\{e_j \otimes f_k\}$ is an orthonormal basis
for the Hilbert space $H_1 \hat{\otimes} H_2$.


Proposition A.67 If $A$ is a bounded operator on $H_1$ and $B$ is a bounded
operator on $H_2$, then there exists a unique bounded operator on $H_1 \hat{\otimes} H_2$,
denoted $A \otimes B$, such that

$$(A \otimes B)(\phi \otimes \psi) = (A\phi) \otimes (B\psi)$$

for all $\phi \in H_1$ and $\psi \in H_2$.


To see that $A \otimes B$ is bounded, first write $A \otimes B$ as $(A \otimes I)(I \otimes B)$. Then,
given any orthonormal basis $\{f_j\}$ for $H_2$, we can decompose $H_1 \hat{\otimes} H_2$ as the
Hilbert space direct sum of subspaces of the form $H_1 \otimes f_j$. The operator
$A \otimes I$ acts on this decomposition as a block-diagonal operator with $A$ in
each diagonal block. From this, it is easy to verify that $\|A \otimes I\| = \|A\|$. A
similar argument shows that $\|I \otimes B\| = \|B\|$, and so

$$\|A \otimes B\| \leq \|A \otimes I\| \, \|I \otimes B\| = \|A\| \, \|B\|.$$

Meanwhile, by taking sequences of unit vectors $\phi_n \in H_1$ and $\psi_n \in H_2$
with $\|A\phi_n\| \to \|A\|$ and $\|B\psi_n\| \to \|B\|$, and noting that $\phi_n \otimes \psi_n$ is a unit
vector with $\|(A \otimes B)(\phi_n \otimes \psi_n)\| = \|A\phi_n\| \, \|B\psi_n\|$, we see that the reverse
inequality holds, and thus that $\|A \otimes B\| = \|A\| \, \|B\|$.
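
In finite dimensions the constructions of Propositions A.64 and A.67 reduce to Kronecker products of coordinate vectors and matrices. The sketch below checks the defining relations for small integer examples; the function names `kron_vec` and `kron_mat`, the basis ordering, and the sample data are assumptions made for illustration only.

```python
def inner(phi, psi):
    """Inner product on C^n, conjugate linear in the first factor."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(M, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def kron_vec(u, v):
    """Coordinates of the tensor product of u and v in the basis e_j (x) f_k,
    with the index pairs (j, k) ordered lexicographically."""
    return [a * b for a in u for b in v]

def kron_mat(A, B):
    """Matrix of the tensor product of A and B in the same basis."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

# Hypothetical operators and vectors on two copies of C^2.
A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
phi, psi = [1, -1], [2, 3]
u2, v2 = [3, 1], [-1, 4]

# (A (x) B)(phi (x) psi) compared with (A phi) (x) (B psi).
left = apply(kron_mat(A, B), kron_vec(phi, psi))
right = kron_vec(apply(A, phi), apply(B, psi))

# The inner product of simple tensors factorizes as in Proposition A.64.
ip_tensor = inner(kron_vec(phi, psi), kron_vec(u2, v2))
ip_factors = inner(phi, u2) * inner(psi, v2)
```

No completion step is needed here, since a finite-dimensional inner product space is automatically complete; the completion in Definition A.65 matters only when both factors are infinite dimensional.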